API Reference¶
This page contains the API documentation for all Python modules in the codebase (excluding `__init__.py` files).
aiperf.cli¶
Main CLI entry point for the AIPerf system.
analyze(user_config, service_config=None) ¶
Sweep through one or more parameters.
Source code in aiperf/cli.py, lines 42–51.
create_template(template_filename=CLIDefaults.TEMPLATE_FILENAME) ¶
Create a template configuration file.
Source code in aiperf/cli.py, lines 54–70.
profile(user_config, service_config=None) ¶
Run the Profile subcommand.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| user_config | UserConfig | User configuration for the benchmark. | required |
| service_config | ServiceConfig \| None | Service configuration options. | None |
Source code in aiperf/cli.py, lines 23–39.
validate_config(user_config=None, service_config=None) ¶
Validate the configuration file.
Source code in aiperf/cli.py, lines 73–82.
aiperf.cli_runner¶
run_system_controller(user_config, service_config) ¶
Run the system controller with the given configuration.
Source code in aiperf/cli_runner.py, lines 6–42.
warn_command_not_implemented(command) ¶
Warn the user that the given subcommand is not implemented.
Source code in aiperf/cli_runner.py, lines 45–62.
aiperf.clients.http.aiohttp_client¶
AioHttpClientMixin ¶
A high-performance HTTP client for communicating with HTTP-based REST APIs using aiohttp.
This class is optimized for maximum performance and accurate timing measurements, making it ideal for benchmarking scenarios.
Source code in aiperf/clients/http/aiohttp_client.py, lines 27–128.
close() async ¶
Close the client.
Source code in aiperf/clients/http/aiohttp_client.py, lines 50–54.
post_request(url, payload, headers, **kwargs) async ¶
Send a streaming or non-streaming POST request to the specified URL with the given payload and headers.
If the response is an SSE stream, it is parsed into a list of SSE messages. Otherwise, it is parsed into a TextResponse object.
Source code in aiperf/clients/http/aiohttp_client.py, lines 56–128.
AioHttpSSEStreamReader ¶
A helper class for reading an SSE stream from an aiohttp.ClientResponse object.
This class is optimized for maximum performance and accurate timing measurements, making it ideal for benchmarking scenarios.
Source code in aiperf/clients/http/aiohttp_client.py, lines 131–192.
__aiter__() async ¶
Iterate over the SSE stream in a performant manner, yielding the raw SSE message together with the perf_counter_ns timestamps of its first and last bytes. This provides the most accurate timing information possible given the nature of the aiohttp library: the first byte is read immediately to capture its timestamp, and the last byte's timestamp is captured after the rest of the chunk has been read.
Returns:
| Type | Description |
|---|---|
| AsyncIterator[tuple[str, int]] | An async iterator of tuples of the raw SSE message and the perf_counter_ns of the first byte. |
Source code in aiperf/clients/http/aiohttp_client.py, lines 157–192.
read_complete_stream() async ¶
Read the complete SSE stream in a performant manner and return a list of SSE messages containing the most accurate timestamp data possible.
Returns:
| Type | Description |
|---|---|
| list[SSEMessage] | A list of SSE messages. |
Source code in aiperf/clients/http/aiohttp_client.py, lines 141–155.
create_tcp_connector(**kwargs) ¶
Create a new connector with the given configuration.
Source code in aiperf/clients/http/aiohttp_client.py, lines 224–251.
parse_sse_message(raw_message, perf_ns) ¶
Parse a raw SSE message into an SSEMessage object.
Parsing logic is based on the official HTML SSE Living Standard: https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream
Source code in aiperf/clients/http/aiohttp_client.py, lines 195–221.
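To make the parsing rules concrete, here is a minimal sketch of WHATWG-style SSE field parsing. It is an illustration of the spec's rules, not the actual `parse_sse_message` implementation; the function name and returned dict shape are assumptions.

```python
# Minimal sketch of WHATWG event-stream field parsing (illustrative only;
# the real parse_sse_message returns an SSEMessage, not a dict).
def parse_sse_fields(raw_message: str) -> dict:
    """Parse one raw SSE message block into its fields."""
    fields = {"event": "message", "data": [], "id": None, "retry": None}
    for line in raw_message.split("\n"):
        if not line or line.startswith(":"):
            continue  # blank lines and comment lines are skipped
        name, _, value = line.partition(":")
        value = value.removeprefix(" ")  # spec: strip one leading space
        if name == "data":
            fields["data"].append(value)  # multiple data lines accumulate
        elif name == "event":
            fields["event"] = value
        elif name == "id":
            fields["id"] = value
        elif name == "retry" and value.isdigit():
            fields["retry"] = int(value)
    fields["data"] = "\n".join(fields["data"])
    return fields

msg = parse_sse_fields('event: token\ndata: {"text": "hi"}\nid: 7')
```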
aiperf.clients.http.defaults¶
AioHttpDefaults dataclass ¶
Default values for aiohttp.ClientSession.
Source code in aiperf/clients/http/defaults.py, lines 62–74.
SocketDefaults dataclass ¶
Default values for socket options.
Source code in aiperf/clients/http/defaults.py, lines 7–59.
apply_to_socket(sock) classmethod ¶
Apply the default socket options to the given socket.
Source code in aiperf/clients/http/defaults.py, lines 32–59.
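As a hedged sketch of what applying socket defaults looks like: the specific options below (`TCP_NODELAY`, `SO_KEEPALIVE`) are illustrative choices that a latency-sensitive benchmarking client might use, not the confirmed contents of `SocketDefaults`.

```python
import socket

# Illustrative stand-in for SocketDefaults.apply_to_socket; the actual
# defaults live in aiperf/clients/http/defaults.py and may differ.
def apply_default_options(sock: socket.socket) -> None:
    # Disable Nagle's algorithm so small request payloads are sent immediately.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Keep idle connections alive between benchmark requests.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
apply_default_options(s)
nodelay = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
s.close()
```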
aiperf.clients.model_endpoint_info¶
Model endpoint information.
This module contains the pydantic models that encapsulate the information needed to send requests to an inference server, primarily around the model, endpoint, and additional request payload information.
EndpointInfo ¶
Bases: AIPerfBaseModel
Information about an endpoint.
Source code in aiperf/clients/model_endpoint_info.py, lines 63–120.
from_user_config(user_config) classmethod ¶
Create an HttpEndpointInfo from a UserConfig.
Source code in aiperf/clients/model_endpoint_info.py, lines 108–120.
ModelEndpointInfo ¶
Bases: AIPerfBaseModel
Information about a model endpoint.
Source code in aiperf/clients/model_endpoint_info.py, lines 123–161.
primary_model property ¶
Get the primary model.
primary_model_name property ¶
Get the primary model name.
url property ¶
Get the full URL for the endpoint.
from_user_config(user_config) classmethod ¶
Create a ModelEndpointInfo from a UserConfig.
Source code in aiperf/clients/model_endpoint_info.py, lines 135–141.
ModelInfo ¶
Bases: AIPerfBaseModel
Information about a model.
Source code in aiperf/clients/model_endpoint_info.py, lines 20–36.
ModelListInfo ¶
Bases: AIPerfBaseModel
Information about a list of models.
Source code in aiperf/clients/model_endpoint_info.py, lines 39–60.
from_user_config(user_config) classmethod ¶
Create a ModelListInfo from a UserConfig.
Source code in aiperf/clients/model_endpoint_info.py, lines 52–60.
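To illustrate how a model-plus-endpoint structure can expose a computed `url` property, here is a sketch using plain dataclasses in place of the pydantic models. The field names other than `url` are assumptions for illustration, not the actual model schema.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the AIPerfBaseModel subclasses; field names
# (base_url, path, model_name) are assumed, not taken from the real models.
@dataclass
class EndpointInfo:
    base_url: str
    path: str

@dataclass
class ModelEndpointInfo:
    endpoint: EndpointInfo
    model_name: str

    @property
    def url(self) -> str:
        # Join the base URL and endpoint path without duplicating slashes.
        return self.endpoint.base_url.rstrip("/") + "/" + self.endpoint.path.lstrip("/")

info = ModelEndpointInfo(
    EndpointInfo("http://localhost:8000", "/v1/chat/completions"),
    "my-model",
)
```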
aiperf.clients.openai.openai_aiohttp¶
OpenAIClientAioHttp ¶
Bases: AioHttpClientMixin, AIPerfLoggerMixin, ABC
Inference client for OpenAI-based requests using aiohttp.
Source code in aiperf/clients/openai/openai_aiohttp.py, lines 17–87.
get_headers(model_endpoint) ¶
Get the headers for the given endpoint.
Source code in aiperf/clients/openai/openai_aiohttp.py, lines 30–48.
get_url(model_endpoint) ¶
Get the URL for the given endpoint.
Source code in aiperf/clients/openai/openai_aiohttp.py, lines 50–55.
send_request(model_endpoint, payload) async ¶
Send an OpenAI request using aiohttp.
Source code in aiperf/clients/openai/openai_aiohttp.py, lines 57–87.
aiperf.clients.openai.openai_chat¶
OpenAIChatCompletionRequestConverter ¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI chat completion requests.
Source code in aiperf/clients/openai/openai_chat.py, lines 15–93.
format_payload(model_endpoint, turn) async ¶
Format the payload for a chat completion request.
Source code in aiperf/clients/openai/openai_chat.py, lines 19–53.
aiperf.clients.openai.openai_completions¶
OpenAICompletionRequestConverter ¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI completion requests.
Source code in aiperf/clients/openai/openai_completions.py, lines 14–41.
format_payload(model_endpoint, turn) async ¶
Format the payload for a completion request.
Source code in aiperf/clients/openai/openai_completions.py, lines 18–41.
aiperf.clients.openai.openai_embeddings¶
OpenAIEmbeddingsRequestConverter ¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI embeddings requests.
Source code in aiperf/clients/openai/openai_embeddings.py, lines 13–39.
format_payload(model_endpoint, turn) async ¶
Format the payload for an embeddings request.
Source code in aiperf/clients/openai/openai_embeddings.py, lines 17–39.
aiperf.clients.openai.openai_responses¶
OpenAIResponsesRequestConverter ¶
Bases: AIPerfLoggerMixin
Request converter for OpenAI Responses requests.
Source code in aiperf/clients/openai/openai_responses.py, lines 14–44.
format_payload(model_endpoint, turn) async ¶
Format the payload for a Responses request.
Source code in aiperf/clients/openai/openai_responses.py, lines 18–44.
aiperf.common.aiperf_logger¶
AIPerfLogger ¶
Logger for AIPerf messages with lazy evaluation support for f-strings.
This logger supports lazy evaluation of f-strings through lambdas, avoiding expensive string formatting when the log level is not enabled.
It also extends the standard logging module with additional log levels:
- TRACE (TRACE < DEBUG)
- NOTICE (INFO < NOTICE < WARNING)
- SUCCESS (WARNING < SUCCESS < ERROR)
Usage:

    logger = AIPerfLogger("my_logger")
    logger.debug(lambda: f"Processing {item} with {count} items")
    logger.info("Simple string message")
    logger.notice("Notice message")
    logger.success("Benchmark completed successfully")

    # Pass local variables to the lambda as defaults so they do not go out of scope:
    logger.debug(lambda i=i: f"Binding loop variable: {i}")
    logger.exception(f"Direct f-string usage: {e}")

Source code in aiperf/common/aiperf_logger.py, lines 25–213.
critical(msg, *args, **kwargs) ¶
Log a critical message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 210–213.
debug(msg, *args, **kwargs) ¶
Log a debug message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 175–178.
error(msg, *args, **kwargs) ¶
Log an error message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 200–203.
exception(msg, *args, **kwargs) ¶
Log an exception message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 205–208.
find_caller(stack_info=False, stacklevel=1) ¶
Find the stack frame of the caller so that the source file name, line number, and function name can be noted.
NOTE: This is a modified version of the findCaller method in the logging module, allowing custom ignored files to be added.
Source code in aiperf/common/aiperf_logger.py, lines 122–163.
get_level_number(level) classmethod ¶
Get the numeric level for the given level.
Source code in aiperf/common/aiperf_logger.py, lines 114–120.
info(msg, *args, **kwargs) ¶
Log an info message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 180–183.
is_valid_level(level) classmethod ¶
Check if the given level is a valid level.
Source code in aiperf/common/aiperf_logger.py, lines 88–112.
log(level, msg, *args, **kwargs) ¶
Log a message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 165–168.
notice(msg, *args, **kwargs) ¶
Log a notice message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 185–188.
success(msg, *args, **kwargs) ¶
Log a success message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 195–198.
trace(msg, *args, **kwargs) ¶
Log a trace message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 170–173.
warning(msg, *args, **kwargs) ¶
Log a warning message with support for lazy evaluation using lambdas.
Source code in aiperf/common/aiperf_logger.py, lines 190–193.
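The lazy-lambda and extra-level ideas above can be sketched with the standard logging module. The numeric level values here are assumptions chosen to satisfy the documented ordering (TRACE < DEBUG, INFO < NOTICE < WARNING, WARNING < SUCCESS < ERROR); AIPerfLogger's actual values may differ.

```python
import logging

# Assumed numeric levels consistent with the documented ordering.
TRACE, NOTICE, SUCCESS = 5, 25, 35
logging.addLevelName(TRACE, "TRACE")
logging.addLevelName(NOTICE, "NOTICE")
logging.addLevelName(SUCCESS, "SUCCESS")

logger = logging.getLogger("sketch")
logger.setLevel(NOTICE)

calls = 0
def expensive_format() -> str:
    """Stand-in for a costly f-string; counts how often it actually runs."""
    global calls
    calls += 1
    return "formatted message"

def lazy_log(level: int, msg) -> None:
    # Only evaluate the lambda if the level is enabled -- this is the
    # lazy-evaluation pattern AIPerfLogger supports.
    if logger.isEnabledFor(level):
        logger.log(level, msg() if callable(msg) else msg)

lazy_log(TRACE, lambda: expensive_format())    # TRACE < NOTICE: never evaluated
lazy_log(SUCCESS, lambda: expensive_format())  # SUCCESS >= NOTICE: evaluated once
```

The payoff is that disabled log calls cost only an `isEnabledFor` check, never a string format.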
aiperf.common.bootstrap¶
bootstrap_and_run_service(service_class, service_config=None, user_config=None, service_id=None, log_queue=None, **kwargs) ¶
Bootstrap the service and run it.
This function loads the service configuration, creates an instance of the service, and runs it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| service_class | type[ServiceProtocol] | The Python class of the service to run. This should be a subclass of BaseService, passed as a type and not an instance. | required |
| service_config | ServiceConfig \| None | The service configuration to use. If not provided, it is loaded from environment variables. | None |
| user_config | UserConfig \| None | The user configuration to use. If not provided, it is loaded from environment variables. | None |
| log_queue | Queue \| None | Optional multiprocessing queue for child-process logging. If provided, child-process logging is set up. | None |
| kwargs |  | Additional keyword arguments to pass to the service constructor. | {} |
Source code in aiperf/common/bootstrap.py, lines 13–93.
aiperf.common.comms.base_comms¶
BaseCommunication ¶
Bases: AIPerfLifecycleMixin, ABC
Base class specifying the base communication layer for AIPerf components.
Source code in aiperf/common/comms/base_comms.py, lines 22–127.
create_client(client_type, address, bind=False, socket_ops=None, max_pull_concurrency=None) abstractmethod ¶
Create a communication client for a given client type and address.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| client_type | CommClientType | The type of client to create. | required |
| address | CommAddressType | The type of address to look up in the communication config, or the address itself. | required |
| bind | bool | Whether to bind or connect the socket. | False |
| socket_ops | dict \| None | Additional socket options to set. | None |
| max_pull_concurrency | int \| None | The maximum number of concurrent pull requests to allow (only used for pull clients). | None |
Source code in aiperf/common/comms/base_comms.py, lines 37–54.
get_address(address_type) abstractmethod ¶
Get the address for a given address type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| address_type | CommAddressType | The type of address to get the address for, or the address itself. | required |
Returns:
| Type | Description |
|---|---|
| str | The address for the given address type, or the address itself if it is a string. |
Source code in aiperf/common/comms/base_comms.py, lines 26–35.
aiperf.common.comms.zmq.dealer_request_client¶
ZMQDealerRequestClient ¶
Bases: BaseZMQClient, TaskManagerMixin
ZMQ DEALER socket client for asynchronous request-response communication.
The DEALER socket connects to ROUTER sockets and can send requests asynchronously, receiving responses through callbacks or awaitable futures.
ASCII Diagram:

    ┌──────────────┐                    ┌──────────────┐
    │    DEALER    │───── Request ─────>│    ROUTER    │
    │   (Client)   │                    │  (Service)   │
    │              │<─── Response ──────│              │
    └──────────────┘                    └──────────────┘

Usage Pattern:
- DEALER clients send requests to ROUTER services
- Responses are routed back to the originating DEALER
DEALER/ROUTER is a Many-to-One communication pattern. If you need Many-to-Many, use a ZMQ proxy as well; see ZMQDealerRouterProxy for more details.
Source code in aiperf/common/comms/zmq/dealer_request_client.py, lines 23–151.
__init__(address, bind, socket_ops=None, **kwargs) ¶
Initialize the ZMQ Dealer (Req) client class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/common/comms/zmq/dealer_request_client.py, lines 47–66.
request(message, timeout=DEFAULT_COMMS_REQUEST_TIMEOUT) async ¶
Send a request and wait for a response for up to timeout seconds.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | Message | The request message to send. | required |
| timeout | float | Maximum time to wait for a response, in seconds. | DEFAULT_COMMS_REQUEST_TIMEOUT |
Returns:
| Name | Type | Description |
|---|---|---|
| Message | Message | The response message received. |
Raises:
| Type | Description |
|---|---|
| CommunicationError | If the request fails. |
| TimeoutError | If the response is not received in time. |
Source code in aiperf/common/comms/zmq/dealer_request_client.py, lines 127–151.
request_async(message, callback) async ¶
Send a request and be notified when the response is received.
Source code in aiperf/common/comms/zmq/dealer_request_client.py, lines 97–125.
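The "awaitable futures" mechanism described above can be sketched without ZeroMQ: pending requests are held in a dict keyed by a correlation id, and an incoming response resolves the matching future. The `request_id` correlation scheme and class name here are assumptions about the general pattern, not the client's actual wire format.

```python
import asyncio

# Sketch of request/response correlation via futures (illustrative;
# the real client routes actual ZMQ frames, not strings).
class RequestCorrelator:
    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    async def request(self, request_id: str, timeout: float) -> str:
        # Park a future that the response handler will resolve.
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(request_id, None)

    def on_response(self, request_id: str, payload: str) -> None:
        # Called when a response frame arrives; wakes the matching request().
        future = self._pending.get(request_id)
        if future is not None and not future.done():
            future.set_result(payload)

async def main() -> str:
    correlator = RequestCorrelator()
    # Simulate a response arriving shortly after the request is sent.
    asyncio.get_running_loop().call_later(0.01, correlator.on_response, "req-1", "pong")
    return await correlator.request("req-1", timeout=1.0)

result = asyncio.run(main())
```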
aiperf.common.comms.zmq.pub_client¶
ZMQPubClient ¶
Bases: BaseZMQClient
The PUB socket broadcasts messages to all connected SUB sockets that have subscribed to the message topic/type.
ASCII Diagram:

    ┌──────────────┐    ┌──────────────┐
    │     PUB      │───>│              │
    │ (Publisher)  │    │              │
    └──────────────┘    │     SUB      │
    ┌──────────────┐    │ (Subscriber) │
    │     PUB      │───>│              │
    │ (Publisher)  │    │              │
    └──────────────┘    └──────────────┘
              OR
    ┌──────────────┐    ┌──────────────┐
    │              │───>│     SUB      │
    │              │    │ (Subscriber) │
    │     PUB      │    └──────────────┘
    │ (Publisher)  │    ┌──────────────┐
    │              │───>│     SUB      │
    │              │    │ (Subscriber) │
    └──────────────┘    └──────────────┘

Usage Pattern:
- A single PUB socket broadcasts messages to all subscribers (One-to-Many), OR multiple PUB sockets broadcast messages to a single SUB socket (Many-to-One)
- SUB sockets filter messages by topic/message_type
- Fire-and-forget messaging (no acknowledgments)
PUB/SUB is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ proxy as well; see ZMQXPubXSubProxy for more details.
Source code in aiperf/common/comms/zmq/pub_client.py, lines 17–108.
__init__(address, bind, socket_ops=None, **kwargs) ¶
Initialize the ZMQ Publisher client class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/common/comms/zmq/pub_client.py, lines 55–70.
publish(message) async ¶
Publish a message. The topic is set automatically based on the message type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | Message | The message to publish (must be a Message object). | required |
Source code in aiperf/common/comms/zmq/pub_client.py, lines 72–96.
aiperf.common.comms.zmq.pull_client¶
ZMQPullClient ¶
Bases: BaseZMQClient
ZMQ PULL socket client for receiving work from PUSH sockets.
The PULL socket receives messages from PUSH sockets in a pipeline pattern, distributing work fairly among multiple PULL workers.
ASCII Diagram:

    ┌─────────────┐      ┌─────────────┐  ┌─────────────┐
    │    PUSH     │      │    PULL     │  │    PULL     │
    │ (Producer)  │      │ (Worker 1)  │  │ (Worker 2)  │
    │             │      └─────────────┘  └─────────────┘
    │ Tasks:      │            ▲                ▲
    │ - Task A    │────────────┘                │
    │ - Task B    │─────────────────────────────┘
    │ - Task C    │────────────┐
    │ - Task D    │            ▼
    └─────────────┘      ┌─────────────┐
                         │    PULL     │
                         │ (Worker N)  │
                         └─────────────┘

Usage Pattern:
- PULL receives work from multiple PUSH producers
- Work is fairly distributed among PULL workers
- Pipeline pattern for distributed processing
- Each message is delivered to exactly one PULL socket
PULL/PUSH is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ proxy as well; see ZMQPushPullProxy for more details.
Source code in aiperf/common/comms/zmq/pull_client.py, lines 21–164.
__init__(address, bind, socket_ops=None, max_pull_concurrency=None, **kwargs) ¶
Initialize the ZMQ Puller class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
| max_pull_concurrency | int | The maximum number of concurrent requests to allow. | None |
Source code in aiperf/common/comms/zmq/pull_client.py, lines 55–82.
register_pull_callback(message_type, callback) ¶
Register a ZMQ Pull data callback for a given message type.
Note that only one callback can be registered for a given message type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message_type | MessageTypeT | The message type to register the callback for. | required |
| callback | Callable[[Message], Coroutine[Any, Any, None]] | The function to call when data is received. | required |
Raises:
| Type | Description |
|---|---|
| CommunicationError | If the client is not initialized. |
Source code in aiperf/common/comms/zmq/pull_client.py, lines 143–164.
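The one-callback-per-message-type rule above amounts to a small dispatch registry. Here is a sketch of that idea; the dict-based messages, `ValueError`, and class name are illustrative, not the actual aiperf types (the real client raises CommunicationError and dispatches Message objects).

```python
import asyncio
from typing import Awaitable, Callable

# Sketch of a register_pull_callback-style registry (illustrative types).
class PullDispatcher:
    def __init__(self) -> None:
        self._callbacks: dict[str, Callable[[dict], Awaitable[None]]] = {}

    def register_pull_callback(
        self, message_type: str, callback: Callable[[dict], Awaitable[None]]
    ) -> None:
        # Enforce the one-callback-per-message-type rule.
        if message_type in self._callbacks:
            raise ValueError(f"callback already registered for {message_type!r}")
        self._callbacks[message_type] = callback

    async def dispatch(self, message: dict) -> bool:
        # Route an incoming message to its registered callback, if any.
        callback = self._callbacks.get(message["message_type"])
        if callback is None:
            return False
        await callback(message)
        return True

async def main() -> tuple[bool, list[dict]]:
    received: list[dict] = []

    async def on_heartbeat(message: dict) -> None:
        received.append(message)

    dispatcher = PullDispatcher()
    dispatcher.register_pull_callback("heartbeat", on_heartbeat)
    handled = await dispatcher.dispatch({"message_type": "heartbeat"})
    return handled, received

handled, received = asyncio.run(main())
```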
aiperf.common.comms.zmq.push_client¶
MAX_PUSH_RETRIES = 2 module-attribute ¶
Maximum number of retries for pushing a message.
RETRY_DELAY_INTERVAL_SEC = 0.1 module-attribute ¶
The interval to wait before retrying to push a message.
ZMQPushClient ¶
Bases: BaseZMQClient
ZMQ PUSH socket client for sending work to PULL sockets.
The PUSH socket sends messages to PULL sockets in a pipeline pattern, distributing work fairly among available PULL workers.
ASCII Diagram:

    ┌─────────────┐      ┌─────────────┐  ┌─────────────┐
    │    PUSH     │      │    PULL     │  │    PULL     │
    │ (Producer)  │      │ (Worker 1)  │  │ (Worker 2)  │
    │             │      └─────────────┘  └─────────────┘
    │ Tasks:      │            ▲                ▲
    │ - Task A    │────────────┘                │
    │ - Task B    │─────────────────────────────┘
    │ - Task C    │────────────┐
    │ - Task D    │            ▼
    └─────────────┘      ┌─────────────┐
                         │    PULL     │
                         │ (Worker 3)  │
                         └─────────────┘

Usage Pattern:
- Round-robin distribution of work tasks (One-to-Many)
- Each message delivered to exactly one worker
- Pipeline pattern for distributed processing
- Automatic load balancing across available workers
PUSH/PULL is a One-to-Many communication pattern. If you need Many-to-Many, use a ZMQ proxy as well; see ZMQPushPullProxy for more details.
Source code in aiperf/common/comms/zmq/push_client.py, lines 23–115.
__init__(address, bind, socket_ops=None, **kwargs) ¶
Initialize the ZMQ Push client class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/common/comms/zmq/push_client.py, lines 57–72.
push(message) async ¶
Push data to a target. The message is routed automatically based on the message type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | Message | The message to send (must be a Message object). | required |
Source code in aiperf/common/comms/zmq/push_client.py, lines 106–115.
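The two module attributes above define a simple retry policy. Here is a sketch of how a push loop might apply them, with the transport send stubbed out; the loop structure and exception type are assumptions, only the two constants come from the source.

```python
import asyncio

# Documented constants from aiperf.common.comms.zmq.push_client.
MAX_PUSH_RETRIES = 2
RETRY_DELAY_INTERVAL_SEC = 0.1

async def push_with_retries(send, message) -> int:
    """Attempt the initial send plus up to MAX_PUSH_RETRIES retries.

    Returns the number of attempts made; re-raises the last error on failure.
    (Illustrative loop structure; not the actual client implementation.)
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            await send(message)
            return attempts
        except ConnectionError:
            if attempts > MAX_PUSH_RETRIES:
                raise
            await asyncio.sleep(RETRY_DELAY_INTERVAL_SEC)

async def main() -> int:
    failures = iter([True, False])  # fail once, then succeed

    async def flaky_send(message) -> None:
        if next(failures):
            raise ConnectionError("socket not ready")

    return await push_with_retries(flaky_send, {"task": "A"})

attempts = asyncio.run(main())
```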
aiperf.common.comms.zmq.router_reply_client¶
ZMQRouterReplyClient ¶
Bases: BaseZMQClient
ZMQ ROUTER socket client for handling requests from DEALER clients.
The ROUTER socket receives requests from DEALER clients and sends responses back to the originating DEALER client using routing envelopes.
ASCII Diagram:

    ┌──────────────┐                    ┌──────────────┐
    │    DEALER    │───── Request ─────>│              │
    │   (Client)   │<──── Response ─────│              │
    └──────────────┘                    │              │
    ┌──────────────┐                    │    ROUTER    │
    │    DEALER    │───── Request ─────>│  (Service)   │
    │   (Client)   │<──── Response ─────│              │
    └──────────────┘                    │              │
    ┌──────────────┐                    │              │
    │    DEALER    │───── Request ─────>│              │
    │   (Client)   │<──── Response ─────│              │
    └──────────────┘                    └──────────────┘

Usage Pattern:
- ROUTER handles requests from multiple DEALER clients
- Maintains routing envelopes to send responses back
- Many-to-one request handling pattern
- Supports concurrent request processing
ROUTER/DEALER is a Many-to-One communication pattern. If you need Many-to-Many, use a ZMQ proxy as well; see ZMQDealerRouterProxy for more details.
Source code in aiperf/common/comms/zmq/router_reply_client.py, lines 21–220.
__init__(address, bind, socket_ops=None, **kwargs) ¶
Initialize the ZMQ Router (Rep) client class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| address | str | The address to bind or connect to. | required |
| bind | bool | Whether to bind or connect the socket. | required |
| socket_ops | dict | Additional socket options to set. | None |
Source code in aiperf/common/comms/zmq/router_reply_client.py, lines 54–75.
register_request_handler(service_id, message_type, handler) ¶
Register a request handler. Whenever a request matching the message type is received, the handler is called and should return a response message. If the handler returns None, the request is ignored.
Note that there is a 1-to-1 mapping between message type and handler.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| service_id | str | The service ID to register the handler for. | required |
| message_type | MessageTypeT | The message type to register the handler for. | required |
| handler | Callable[[Message], Coroutine[Any, Any, Message \| None]] | The handler to register. | required |
Source code in aiperf/common/comms/zmq/router_reply_client.py, lines 81–107.
aiperf.common.comms.zmq.sub_client¶
ZMQSubClient
¶
Bases: BaseZMQClient
ZMQ SUB socket client for subscribing to messages from PUB sockets. One-to-Many or Many-to-One communication pattern.
ASCII Diagram:

```
┌──────────────┐     ┌──────────────┐
│     PUB      │────>│              │
│ (Publisher)  │     │              │
└──────────────┘     │     SUB      │
┌──────────────┐     │ (Subscriber) │
│     PUB      │────>│              │
│ (Publisher)  │     │              │
└──────────────┘     └──────────────┘

                OR

┌──────────────┐     ┌──────────────┐
│              │────>│     SUB      │
│              │     │ (Subscriber) │
│     PUB      │     └──────────────┘
│ (Publisher)  │     ┌──────────────┐
│              │────>│     SUB      │
│              │     │ (Subscriber) │
└──────────────┘     └──────────────┘
```
Usage Pattern:
- A single SUB socket subscribes to multiple PUB publishers (One-to-Many), or multiple SUB sockets subscribe to a single PUB publisher (Many-to-One)
- Subscribes to specific message topics/types
- Receives all messages matching its subscriptions

SUB/PUB is a One-to-Many communication pattern. If you need Many-to-Many, add a ZMQ proxy; see :class:`ZMQXPubXSubProxy` for more details.
Source code in aiperf/common/comms/zmq/sub_client.py, lines 22–199.
__init__(address, bind, socket_ops=None, **kwargs)
¶
Initialize the ZMQ Subscriber class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `address` | `str` | The address to bind or connect to. | *required* |
| `bind` | `bool` | Whether to bind or connect the socket. | *required* |
| `socket_ops` | `dict` | Additional socket options to set. | `None` |
Source code in aiperf/common/comms/zmq/sub_client.py, lines 61–78.
subscribe(message_type, callback)
async
¶
Subscribe to a message_type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `message_type` | `MessageTypeT` | The MessageTypeT to subscribe to. | *required* |
| `callback` | `Callable[[Message], Any]` | Function to call when a message is received (receives the Message object). | *required* |
Source code in aiperf/common/comms/zmq/sub_client.py, lines 101–118.
subscribe_all(message_callback_map)
async
¶
Subscribe to all message_types in the map. For each MessageType, a single callback or a list of callbacks can be provided.
Source code in aiperf/common/comms/zmq/sub_client.py, lines 80–99.
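The `subscribe`/`subscribe_all` bookkeeping can be sketched as follows. `SubClientSketch` and its `deliver` method are illustrative stand-ins for the real client and its receive loop, not the AIPerf implementation:

```python
import asyncio


class SubClientSketch:
    """Sketch of the subscribe / subscribe_all semantics (not the real client)."""

    def __init__(self) -> None:
        self._callbacks: dict[str, list] = {}

    async def subscribe(self, message_type: str, callback) -> None:
        # Multiple callbacks may be registered for the same message type.
        self._callbacks.setdefault(message_type, []).append(callback)

    async def subscribe_all(self, message_callback_map: dict) -> None:
        # Each value may be a single callback or a list of callbacks.
        for message_type, callbacks in message_callback_map.items():
            if not isinstance(callbacks, list):
                callbacks = [callbacks]
            for callback in callbacks:
                await self.subscribe(message_type, callback)

    def deliver(self, message_type: str, message: dict) -> None:
        # Stand-in for the receive loop: invoke every matching callback.
        for callback in self._callbacks.get(message_type, []):
            callback(message)


received: list[dict] = []
client = SubClientSketch()
asyncio.run(client.subscribe_all({
    "heartbeat": lambda msg: received.append(msg),
    "status": [lambda msg: received.append(msg), lambda msg: received.append(msg)],
}))
client.deliver("status", {"ok": True})
```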
aiperf.common.comms.zmq.zmq_base_client¶
BaseZMQClient
¶
Bases: AIPerfLifecycleMixin
Base class for all ZMQ clients. It can be used as-is to create a new ZMQ client, or it can be subclassed to create specific ZMQ client functionality.
It inherits from :class:`AIPerfLifecycleMixin`, allowing derived classes to implement specific hooks.
Source code in aiperf/common/comms/zmq/zmq_base_client.py, lines 18–132.
socket_type_name
property
¶
Get the name of the socket type.
__init__(socket_type, address, bind, socket_ops=None, client_id=None, **kwargs)
¶
Initialize the ZMQ Base class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `address` | `str` | The address to bind or connect to. | *required* |
| `bind` | `bool` | Whether to BIND or CONNECT the socket. | *required* |
| `socket_type` | `SocketType` | The type of ZMQ socket (e.g. PUB, SUB, ROUTER, DEALER). | *required* |
| `socket_ops` | `dict` | Additional socket options to set. | `None` |
Source code in aiperf/common/comms/zmq/zmq_base_client.py, lines 26–55.
aiperf.common.comms.zmq.zmq_comms¶
BaseZMQCommunication
¶
Bases: BaseCommunication, AIPerfLoggerMixin, ABC
ZeroMQ-based implementation of the CommunicationProtocol.
Uses ZeroMQ for publish/subscribe, request/reply, and pull/push patterns to facilitate communication between AIPerf components.
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 32–100.
create_client(client_type, address, bind=False, socket_ops=None, max_pull_concurrency=None, **kwargs)
¶
Create a communication client for a given client type and address.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `client_type` | `CommClientType` | The type of client to create. | *required* |
| `address` | `CommAddressType` | The type of address to look up in the communication config, or the address itself. | *required* |
| `bind` | `bool` | Whether to bind or connect the socket. | `False` |
| `socket_ops` | `dict \| None` | Additional socket options to set. | `None` |
| `max_pull_concurrency` | `int \| None` | The maximum number of concurrent pull requests to allow (only used for pull clients). | `None` |
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 60–100.
get_address(address_type)
¶
Get the actual address based on the address type from the config.
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 54–58.
ZMQIPCCommunication
¶
Bases: BaseZMQCommunication
ZeroMQ-based implementation of the Communication interface using IPC transport.
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 117–159.
__init__(config=None)
¶
Initialize ZMQ IPC communication.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `ZMQIPCConfig \| None` | ZMQIPCConfig object with configuration parameters. | `None` |
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 122–130.
ZMQTCPCommunication
¶
Bases: BaseZMQCommunication
ZeroMQ-based implementation of the Communication interface using TCP transport.
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 103–114.
__init__(config=None)
¶
Initialize ZMQ TCP communication.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `ZMQTCPConfig \| None` | ZMQTCPConfig object with configuration parameters. | `None` |
Source code in aiperf/common/comms/zmq/zmq_comms.py, lines 108–114.
aiperf.common.comms.zmq.zmq_defaults¶
ZMQSocketDefaults
¶
Default values for ZMQ sockets.
Source code in aiperf/common/comms/zmq/zmq_defaults.py, lines 5–16.
aiperf.common.comms.zmq.zmq_proxy_base¶
BaseZMQProxy
¶
Bases: AIPerfLifecycleMixin, ABC
A Base ZMQ Proxy class.
- Frontend and backend sockets forward messages bidirectionally
- Frontend and backend sockets both BIND
- Multiple clients CONNECT to `frontend_address`
- Multiple services CONNECT to `backend_address`
- Control: optional REP socket for proxy commands (start/stop/pause); not implemented yet
- Monitoring: optional PUB socket that broadcasts copies of all forwarded messages; not implemented yet
- The proxy runs in a separate thread to avoid blocking the main event loop
Source code in aiperf/common/comms/zmq/zmq_proxy_base.py, lines 54–254.
__init__(frontend_socket_class, backend_socket_class, zmq_proxy_config, socket_ops=None, proxy_uuid=None)
¶
Initialize the ZMQ Proxy. This is a base class for all ZMQ Proxies.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frontend_socket_class` | `type[BaseZMQClient]` | The frontend socket class. | *required* |
| `backend_socket_class` | `type[BaseZMQClient]` | The backend socket class. | *required* |
| `zmq_proxy_config` | `BaseZMQProxyConfig` | The ZMQ proxy configuration. | *required* |
| `socket_ops` | `dict` | Additional socket options to set. | `None` |
| `proxy_uuid` | `str` | An optional UUID for the proxy instance. If not provided, a new UUID will be generated. Useful for tracing and debugging. | `None` |
Source code in aiperf/common/comms/zmq/zmq_proxy_base.py, lines 67–136.
from_config(config, socket_ops=None)
abstractmethod
classmethod
¶
Create a BaseZMQProxy from a BaseZMQProxyConfig, or None if not provided.
Source code in aiperf/common/comms/zmq/zmq_proxy_base.py, lines 138–146.
ProxySocketClient
¶
Bases: BaseZMQClient
A ZMQ Proxy socket client class that extends BaseZMQClient.
This class is used to create proxy sockets for the frontend, backend, capture, and control endpoint types of a ZMQ Proxy.
Source code in aiperf/common/comms/zmq/zmq_proxy_base.py, lines 26–51.
aiperf.common.comms.zmq.zmq_proxy_sockets¶
ZMQDealerRouterProxy = define_proxy_class(ZMQProxyType.DEALER_ROUTER, create_proxy_socket_class(SocketType.ROUTER, ProxyEndType.Frontend), create_proxy_socket_class(SocketType.DEALER, ProxyEndType.Backend))
module-attribute
¶
A ROUTER socket for the proxy's frontend and a DEALER socket for the proxy's backend.
ASCII Diagram:

```
┌───────────┐     ┌──────────────────────────────────┐      ┌───────────┐
│  DEALER   │<───>│              PROXY               │<────>│  ROUTER   │
│ Client 1  │     │  ┌──────────┐      ┌──────────┐  │      │ Service 1 │
└───────────┘     │  │  ROUTER  │<────>│  DEALER  │  │      └───────────┘
┌───────────┐     │  │ Frontend │      │ Backend  │  │      ┌───────────┐
│  DEALER   │<───>│  └──────────┘      └──────────┘  │<────>│  ROUTER   │
│ Client N  │     └──────────────────────────────────┘      │ Service N │
└───────────┘                                               └───────────┘
```
The ROUTER frontend socket receives messages from DEALER clients and forwards them through the proxy to ROUTER services. The ZMQ proxy handles the message routing automatically.
The DEALER backend socket receives messages from ROUTER services and forwards them through the proxy to DEALER clients. The ZMQ proxy handles the message routing automatically.
CRITICAL: This socket must NOT have an identity when used in a proxy configuration, as it needs to be transparent to preserve routing envelopes for proper response forwarding back to original DEALER clients.
ZMQPushPullProxy = define_proxy_class(ZMQProxyType.PUSH_PULL, create_proxy_socket_class(SocketType.PULL, ProxyEndType.Frontend), create_proxy_socket_class(SocketType.PUSH, ProxyEndType.Backend))
module-attribute
¶
A PULL socket for the proxy's frontend and a PUSH socket for the proxy's backend.
ASCII Diagram:

```
┌───────────┐      ┌─────────────────────────────────┐      ┌───────────┐
│   PUSH    │─────>│              PROXY              │─────>│   PULL    │
│ Client 1  │      │  ┌──────────┐     ┌──────────┐  │      │ Service 1 │
└───────────┘      │  │   PULL   │────>│   PUSH   │  │      └───────────┘
┌───────────┐      │  │ Frontend │     │ Backend  │  │      ┌───────────┐
│   PUSH    │─────>│  └──────────┘     └──────────┘  │─────>│   PULL    │
│ Client N  │      └─────────────────────────────────┘      │ Service N │
└───────────┘                                               └───────────┘
```
The PULL frontend socket receives messages from PUSH clients and forwards them through the proxy to PUSH services. The ZMQ proxy handles the message routing automatically.
The PUSH backend socket forwards messages from the proxy to PULL services. The ZMQ proxy handles the message routing automatically.
ZMQXPubXSubProxy = define_proxy_class(ZMQProxyType.XPUB_XSUB, create_proxy_socket_class(SocketType.XSUB, ProxyEndType.Frontend), create_proxy_socket_class(SocketType.XPUB, ProxyEndType.Backend))
module-attribute
¶
An XSUB socket for the proxy's frontend and an XPUB socket for the proxy's backend.
ASCII Diagram:

```
┌───────────┐     ┌─────────────────────────────────┐     ┌───────────┐
│    PUB    │────>│              PROXY              │────>│    SUB    │
│ Client 1  │     │  ┌──────────┐     ┌──────────┐  │     │ Service 1 │
└───────────┘     │  │   XSUB   │────>│   XPUB   │  │     └───────────┘
┌───────────┐     │  │ Frontend │     │ Backend  │  │     ┌───────────┐
│    PUB    │────>│  └──────────┘     └──────────┘  │────>│    SUB    │
│ Client N  │     └─────────────────────────────────┘     │ Service N │
└───────────┘                                             └───────────┘
```
The XSUB frontend socket receives messages from PUB clients and forwards them through the proxy to XPUB services. The ZMQ proxy handles the message routing automatically.
The XPUB backend socket forwards messages from the proxy to SUB services. The ZMQ proxy handles the message routing automatically.
create_proxy_socket_class(socket_type, end_type)
¶
Create a proxy socket class using the specified socket type. This is used to reduce the boilerplate code required to create a ZMQ Proxy class.
Source code in aiperf/common/comms/zmq/zmq_proxy_sockets.py, lines 21–53.
define_proxy_class(proxy_type, frontend_socket_class, backend_socket_class)
¶
This function reduces the boilerplate code required to create a ZMQ Proxy class. It will generate a ZMQ Proxy class and register it with the ZMQProxyFactory.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `proxy_type` | `ZMQProxyType` | The type of proxy to generate. | *required* |
| `frontend_socket_class` | `type[BaseZMQClient]` | The class of the frontend socket. | *required* |
| `backend_socket_class` | `type[BaseZMQClient]` | The class of the backend socket. | *required* |
Source code in aiperf/common/comms/zmq/zmq_proxy_sockets.py, lines 56–108.
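The generate-and-register pattern behind `define_proxy_class` can be sketched with `type()` and a registry dict. `ProxyFactorySketch`, the class-naming scheme, and the placeholder socket classes are assumptions for illustration, not the real `ZMQProxyFactory` implementation:

```python
class ProxyFactorySketch:
    """Stand-in for the real ZMQProxyFactory registry."""

    _registry: dict[str, type] = {}

    @classmethod
    def register(cls, proxy_type: str, proxy_cls: type) -> None:
        cls._registry[proxy_type] = proxy_cls

    @classmethod
    def create(cls, proxy_type: str, **kwargs):
        return cls._registry[proxy_type](**kwargs)


def define_proxy_class(proxy_type: str, frontend_cls: type, backend_cls: type) -> type:
    """Generate a proxy class for the given socket pair and register it."""
    name = "".join(part.capitalize() for part in proxy_type.split("_")) + "Proxy"
    proxy_cls = type(name, (object,), {
        "proxy_type": proxy_type,
        "frontend_socket_class": frontend_cls,
        "backend_socket_class": backend_cls,
    })
    ProxyFactorySketch.register(proxy_type, proxy_cls)
    return proxy_cls


# Placeholder socket classes standing in for the real BaseZMQClient subclasses.
class RouterFrontend: ...
class DealerBackend: ...

ZMQDealerRouterProxy = define_proxy_class("dealer_router", RouterFrontend, DealerBackend)
```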
aiperf.common.config.audio_config¶
AudioConfig
¶
Bases: BaseConfig
A configuration class for defining audio related settings.
Source code in aiperf/common/config/audio_config.py, lines 52–134.
AudioLengthConfig
¶
Bases: BaseConfig
A configuration class for defining audio length related settings.
Source code in aiperf/common/config/audio_config.py, lines 16–49.
aiperf.common.config.base_config¶
BaseConfig
¶
Bases: AIPerfBaseModel
Base configuration class for all configurations.
Source code in aiperf/common/config/base_config.py, lines 16–147.
serialize_to_yaml(verbose=False, indent=4)
¶
Serialize a Pydantic model to a YAML string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `verbose` | `bool` | Whether to include verbose comments in the YAML output. | `False` |
| `indent` | `int` | The per-level indentation to use. | `4` |
Source code in aiperf/common/config/base_config.py, lines 21–50.
aiperf.common.config.config_defaults¶
aiperf.common.config.config_validators¶
parse_file(value)
¶
Parses the given string value and returns a Path object if the value represents a valid file or directory. Returns None if the input value is empty.

Args:
    value (str): The string value to parse.

Returns:
    Optional[Path]: A Path object if the value is valid, or None if the value is empty.

Raises:
    ValueError: If the value is not a valid file or directory.
Source code in aiperf/common/config/config_validators.py, lines 206–227.
parse_goodput(goodputs)
¶
Parses and validates a dictionary of goodput values, ensuring that all values are non-negative integers or floats, and converts them to floats.

Args:
    goodputs (Dict[str, Any]): A dictionary where keys are target metric names (strings) and values are the corresponding goodput values.

Returns:
    Dict[str, float]: A dictionary with the same keys as the input, but with all values converted to floats.

Raises:
    ValueError: If any value in the input dictionary is not an integer or float, or if any value is negative.
Source code in aiperf/common/config/config_validators.py, lines 176–203.
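The documented rules can be sketched as follows; the function name is a stand-in and the explicit `bool` exclusion is an assumption, not the library code:

```python
from typing import Any


def parse_goodput_sketch(goodputs: dict[str, Any]) -> dict[str, float]:
    """Sketch: validate non-negative numeric goodput values, convert to float."""
    parsed: dict[str, float] = {}
    for metric, value in goodputs.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"Goodput for {metric!r} must be an int or float")
        if value < 0:
            raise ValueError(f"Goodput for {metric!r} must be non-negative")
        parsed[metric] = float(value)
    return parsed
```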
parse_service_types(input)
¶
Parses the input to ensure it is a set of service types. Will replace hyphens with underscores for user convenience.
Source code in aiperf/common/config/config_validators.py, lines 73–82.
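The hyphen-to-underscore normalization can be sketched as below; the handling of string vs. iterable input and the set return type are assumptions for illustration:

```python
def parse_service_types_sketch(value) -> set[str]:
    """Sketch: normalize hyphens to underscores for user convenience."""
    # Accept either a comma-separated string or an iterable of names.
    items = value.split(",") if isinstance(value, str) else value
    return {str(item).strip().replace("-", "_") for item in items}
```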
parse_str_or_csv_list(input)
¶
Parses the input to ensure it is either a string or a list. If the input is a string, it splits the string by commas and trims any whitespace around each element, returning the result as a list. If the input is already a list, it will split each item by commas and trim any whitespace around each element, returning the combined result as a list. If the input is neither a string nor a list, a ValueError is raised.
Examples:
    [1, 2, 3] -> [1, 2, 3]
    "1,2,3" -> ["1", "2", "3"]
    ["1,2,3", "4,5,6"] -> ["1", "2", "3", "4", "5", "6"]
    ["1,2,3", 4, 5] -> ["1", "2", "3", 4, 5]
Source code in aiperf/common/config/config_validators.py, lines 45–70.
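A minimal sketch of the documented splitting rules (not the library implementation; the function name is a stand-in):

```python
def parse_str_or_csv_list_sketch(value):
    """Sketch: split strings on commas, flatten string items inside lists."""
    if isinstance(value, str):
        return [item.strip() for item in value.split(",")]
    if isinstance(value, list):
        result = []
        for entry in value:
            if isinstance(entry, str):
                # String items are themselves split on commas.
                result.extend(item.strip() for item in entry.split(","))
            else:
                result.append(entry)
        return result
    raise ValueError(f"Expected a string or list, got {type(value).__name__}")
```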
parse_str_or_dict(input)
¶
Parses the input to ensure it is a dictionary.
- If the input is a string:
  - If the string starts with '{', it is parsed as a JSON string.
  - Otherwise, the string is split by commas, and each item is split by the first colon into a key and value, with surrounding whitespace trimmed.
- If the input is already a dictionary, it is returned as-is.
- If the input is a list, it is converted to a dictionary by splitting each string by the first colon into a key and value, with surrounding whitespace trimmed.
- Otherwise, a ValueError is raised.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input` | `Any` | The input to be parsed. Expected to be a string, list, or dictionary. | *required* |

Returns:
    dict[str, Any]: A dictionary derived from the input.

Raises:
    ValueError: If the input is neither a string, list, nor dictionary, or if the parsing fails.
Source code in aiperf/common/config/config_validators.py, lines 85–134.
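The rules above can be sketched in a few lines; this is an illustrative reimplementation of the documented behavior, not the library code:

```python
import json
from typing import Any


def parse_str_or_dict_sketch(value: Any) -> dict[str, Any]:
    """Sketch of the documented parsing rules."""
    if isinstance(value, dict):
        return value  # already a dict: returned as-is
    if isinstance(value, str):
        if value.lstrip().startswith("{"):
            return json.loads(value)  # JSON object string
        # "k1: v1, k2: v2" -> split on commas, then on the first colon
        pairs = (item.split(":", 1) for item in value.split(",") if item.strip())
        return {key.strip(): val.strip() for key, val in pairs}
    if isinstance(value, list):
        # ["k1: v1", "k2: v2"] -> split each item on the first colon
        pairs = (item.split(":", 1) for item in value)
        return {key.strip(): val.strip() for key, val in pairs}
    raise ValueError(f"Cannot parse {value!r} into a dict")
```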
parse_str_or_list(input)
¶
Parses the input to ensure it is either a string or a list. If the input is a string, it splits the string by commas and trims any whitespace around each element, returning the result as a list. If the input is already a list, it is returned as-is. If the input is neither a string nor a list, a ValueError is raised.

Args:
    input (Any): The input to be parsed. Expected to be a string or a list.

Returns:
    list: A list of strings derived from the input.

Raises:
    ValueError: If the input is neither a string nor a list.
Source code in aiperf/common/config/config_validators.py, lines 16–42.
parse_str_or_list_of_positive_values(input)
¶
Parses the input to ensure it is a list of positive integers or floats.

This function first converts the input into a list using parse_str_or_list. It then validates that each value in the list is either an integer or a float and that all values are strictly greater than zero. If any value fails this validation, a ValueError is raised.

Args:
    input (Any): The input to be parsed. It can be a string or a list.

Returns:
    List[Any]: A list of positive integers or floats.

Raises:
    ValueError: If any value in the parsed list is not a positive integer or float.
Source code in aiperf/common/config/config_validators.py, lines 145–173.
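A sketch of the documented validation, assuming string inputs are converted to floats after splitting (the conversion detail and the function name are assumptions, not the library code):

```python
def parse_positive_values_sketch(value):
    """Sketch: parse a string or list into strictly positive numbers."""
    # Convert a comma-separated string to a list, as parse_str_or_list would.
    items = [v.strip() for v in value.split(",")] if isinstance(value, str) else list(value)
    parsed = []
    for item in items:
        number = float(item) if isinstance(item, str) else item
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(item, bool) or not isinstance(number, (int, float)) or number <= 0:
            raise ValueError(f"Expected a positive number, got {item!r}")
        parsed.append(number)
    return parsed
```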
aiperf.common.config.conversation_config¶
ConversationConfig
¶
Bases: BaseConfig
A configuration class for defining conversations related settings.
Source code in aiperf/common/config/conversation_config.py, lines 112–137.
TurnConfig
¶
Bases: BaseConfig
A configuration class for defining turn related settings in a conversation.
Source code in aiperf/common/config/conversation_config.py, lines 72–109.
TurnDelayConfig
¶
Bases: BaseConfig
A configuration class for defining turn delay related settings.
Source code in aiperf/common/config/conversation_config.py, lines 18–69.
aiperf.common.config.endpoint_config¶
EndpointConfig
¶
Bases: BaseConfig
A configuration class for defining endpoint related settings.
Source code in aiperf/common/config/endpoint_config.py, lines 16–167.
aiperf.common.config.groups¶
Groups
¶
Groups for the CLI.
NOTE: These groups are listed in the order in which they will be displayed in the help text.
Source code in aiperf/common/config/groups.py, lines 6–27.
aiperf.common.config.image_config¶
ImageConfig
¶
Bases: BaseConfig
A configuration class for defining image related settings.
Source code in aiperf/common/config/image_config.py, lines 87–123.
ImageHeightConfig
¶
Bases: BaseConfig
A configuration class for defining image height related settings.
Source code in aiperf/common/config/image_config.py, lines 15–48.
ImageWidthConfig
¶
Bases: BaseConfig
A configuration class for defining image width related settings.
Source code in aiperf/common/config/image_config.py, lines 51–84.
aiperf.common.config.input_config¶
InputConfig
¶
Bases: BaseConfig
A configuration class for defining input related settings.
Source code in aiperf/common/config/input_config.py, lines 28–155.
validate_fixed_schedule()
¶
Validate the fixed schedule configuration.
Source code in aiperf/common/config/input_config.py, lines 35–43.
aiperf.common.config.loader¶
load_service_config()
¶
Load the service configuration.
Source code in aiperf/common/config/loader.py, lines 7–10.
load_user_config()
¶
Load the user configuration.
Source code in aiperf/common/config/loader.py, lines 13–16.
aiperf.common.config.loadgen_config¶
LoadGeneratorConfig
¶
Bases: BaseConfig
A configuration class for defining top-level load generator settings.
Source code in aiperf/common/config/loadgen_config.py, lines 15–94.
aiperf.common.config.measurement_config¶
MeasurementConfig
¶
Bases: BaseConfig
A configuration class for defining top-level measurement settings.
Source code in aiperf/common/config/measurement_config.py, lines 14–63.
aiperf.common.config.output_config¶
OutputConfig
¶
Bases: BaseConfig
A configuration class for defining output related settings.
Source code in aiperf/common/config/output_config.py, lines 15–34.
aiperf.common.config.prompt_config¶
InputTokensConfig
¶
Bases: BaseConfig
A configuration class for defining input token related settings.
Source code in aiperf/common/config/prompt_config.py, lines 19–73.
OutputTokensConfig
¶
Bases: BaseConfig
A configuration class for defining output token related settings.
Source code in aiperf/common/config/prompt_config.py, lines 76–132.
PrefixPromptConfig
¶
Bases: BaseConfig
A configuration class for defining prefix prompt related settings.
Source code in aiperf/common/config/prompt_config.py, lines 135–180.
PromptConfig
¶
Bases: BaseConfig
A configuration class for defining prompt related settings.
Source code in aiperf/common/config/prompt_config.py, lines 183–209.
aiperf.common.config.service_config¶
ServiceConfig
¶
Bases: BaseSettings
Base configuration for all services. It will be provided to all services during their `__init__` function.
Source code in aiperf/common/config/service_config.py, lines 28–263.
validate_comm_config()
¶
Initialize the comm_config if it is not provided, based on the comm_backend.
Source code in aiperf/common/config/service_config.py, lines 49–59.
validate_log_level_from_verbose_flags()
¶
Set log level based on verbose flags.
Source code in aiperf/common/config/service_config.py, lines 40–47.
aiperf.common.config.sweep_config¶
SweepConfig
¶
Bases: BaseConfig
A sweep of parameters.
Source code in aiperf/common/config/sweep_config.py, lines 99–100.
SweepParam
¶
Bases: BaseConfig
A parameter to be swept.
Source code in aiperf/common/config/sweep_config.py, lines 8–9.
aiperf.common.config.tokenizer_config¶
TokenizerConfig
¶
Bases: BaseConfig
A configuration class for defining tokenizer related settings.
Source code in aiperf/common/config/tokenizer_config.py, lines 14–64.
aiperf.common.config.user_config¶
UserConfig
¶
Bases: BaseConfig
A configuration class for defining top-level user settings.
Source code in aiperf/common/config/user_config.py, lines 17–62.
aiperf.common.config.worker_config¶
WorkersConfig
¶
Bases: BaseConfig
Worker configuration.
Source code in aiperf/common/config/worker_config.py, lines 13–50.
aiperf.common.config.zmq_config¶
BaseZMQCommunicationConfig
¶
Bases: BaseModel, ABC
Configuration for ZMQ communication.
Source code in aiperf/common/config/zmq_config.py, lines 36–76.
credit_drop_address
abstractmethod
property
¶
Get the credit drop address based on protocol configuration.
credit_return_address
abstractmethod
property
¶
Get the credit return address based on protocol configuration.
records_push_pull_address
abstractmethod
property
¶
Get the records push/pull address based on protocol configuration.
get_address(address_type)
¶
Get the actual address based on the address type.
Source code in aiperf/common/config/zmq_config.py, lines 59–76.
BaseZMQProxyConfig
¶
Bases: BaseModel, ABC
Configuration Protocol for ZMQ Proxy.
Source code in aiperf/common/config/zmq_config.py, lines 12–33.
backend_address
abstractmethod
property
¶
Get the backend address based on protocol configuration.
capture_address
abstractmethod
property
¶
Get the capture address based on protocol configuration.
control_address
abstractmethod
property
¶
Get the control address based on protocol configuration.
frontend_address
abstractmethod
property
¶
Get the frontend address based on protocol configuration.
ZMQIPCConfig
¶
Bases: BaseZMQCommunicationConfig
Configuration for IPC transport.
Source code in aiperf/common/config/zmq_config.py
ZMQIPCProxyConfig
¶
Bases: BaseZMQProxyConfig
Configuration for IPC proxy.
Source code in aiperf/common/config/zmq_config.py
backend_address
property
¶
Get the backend address based on protocol configuration.
capture_address
property
¶
Get the capture address based on protocol configuration.
control_address
property
¶
Get the control address based on protocol configuration.
frontend_address
property
¶
Get the frontend address based on protocol configuration.
ZMQTCPConfig
¶
Bases: BaseZMQCommunicationConfig
Configuration for TCP transport.
Source code in aiperf/common/config/zmq_config.py
ZMQTCPProxyConfig
¶
Bases: BaseZMQProxyConfig
Configuration for TCP proxy.
Source code in aiperf/common/config/zmq_config.py
backend_address
property
¶
Get the backend address based on protocol configuration.
capture_address
property
¶
Get the capture address based on protocol configuration.
control_address
property
¶
Get the control address based on protocol configuration.
frontend_address
property
¶
Get the frontend address based on protocol configuration.
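The split between the TCP and IPC variants can be illustrated with a simplified sketch. The field names, ports, and paths below are assumptions for illustration only, not the actual Pydantic models:

```python
from dataclasses import dataclass


@dataclass
class TCPAddresses:
    """Sketch of TCP address construction: one host plus a port per channel."""

    host: str = "127.0.0.1"
    frontend_port: int = 5661
    backend_port: int = 5662

    @property
    def frontend_address(self) -> str:
        return f"tcp://{self.host}:{self.frontend_port}"

    @property
    def backend_address(self) -> str:
        return f"tcp://{self.host}:{self.backend_port}"


@dataclass
class IPCAddresses:
    """Sketch of IPC address construction: a Unix socket file per channel."""

    path: str = "/tmp/aiperf"

    @property
    def frontend_address(self) -> str:
        return f"ipc://{self.path}/frontend.sock"

    @property
    def backend_address(self) -> str:
        return f"ipc://{self.path}/backend.sock"
```

Because both sketches expose the same property names, caller code can hold either one behind a common abstract interface, which is the role the abstract base classes above play.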
aiperf.common.constants¶
DEFAULT_COMMS_REQUEST_TIMEOUT = 10.0
module-attribute
¶
Default timeout for requests from req_clients to rep_clients in seconds.
DEFAULT_PULL_CLIENT_MAX_CONCURRENCY = 100000
module-attribute
¶
Default maximum concurrency for pull clients.
DEFAULT_SERVICE_REGISTRATION_TIMEOUT = 5.0
module-attribute
¶
Default timeout for service registration in seconds.
DEFAULT_SERVICE_START_TIMEOUT = 5.0
module-attribute
¶
Default timeout for service start in seconds.
DEFAULT_STREAMING_MAX_QUEUE_SIZE = 100000
module-attribute
¶
Default maximum queue size for streaming post processors.
TASK_CANCEL_TIMEOUT_LONG = 5.0
module-attribute
¶
Maximum time to wait for complex tasks to complete when cancelling them (like parent tasks).
TASK_CANCEL_TIMEOUT_SHORT = 2.0
module-attribute
¶
Maximum time to wait for simple tasks to complete when cancelling them.
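As an illustration of how such cancel timeouts are typically applied, here is a hedged sketch of cancelling an asyncio task with a bounded wait. This is not the library's actual shutdown code; only the constant's value is taken from the documentation above:

```python
import asyncio

# Mirrors TASK_CANCEL_TIMEOUT_SHORT documented above.
TASK_CANCEL_TIMEOUT_SHORT = 2.0


async def cancel_task(task: asyncio.Task, timeout: float = TASK_CANCEL_TIMEOUT_SHORT) -> bool:
    """Request cancellation and wait up to `timeout` seconds for the task to finish."""
    task.cancel()
    try:
        await asyncio.wait_for(task, timeout)
    except asyncio.CancelledError:
        pass  # the task acknowledged the cancellation
    except asyncio.TimeoutError:
        return False  # the task did not stop within the timeout
    return task.done()


async def main() -> bool:
    task = asyncio.create_task(asyncio.sleep(60))
    await asyncio.sleep(0)  # let the task start before cancelling it
    return await cancel_task(task)
```

A short timeout keeps shutdown snappy for simple tasks, while the longer timeout gives parent tasks time to tear down their children.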
aiperf.common.decorators¶
Decorators for AIPerf components. Note that these are not the same as hooks: hooks specify that a function should be called at a specific time, while decorators specify that a class or function should be treated in a specific way.
See also: :mod:aiperf.common.hooks for hook decorators.
DecoratorAttrs
¶
Constant attribute names for decorators.
When you decorate a class with a decorator, the decorator type and parameters are set as attributes on the class.
Source code in aiperf/common/decorators.py
implements_protocol(protocol)
¶
Decorator to specify that the class implements the given protocol.
Example:
```python
@implements_protocol(ServiceProtocol)
class BaseService:
    pass
```

The above is equivalent to setting:

```python
BaseService.__implements_protocol__ = ServiceProtocol
```
Source code in aiperf/common/decorators.py
aiperf.common.enums.base_enums¶
CaseInsensitiveStrEnum
¶
Bases: str, Enum
CaseInsensitiveStrEnum is a custom enumeration class that extends str and Enum to provide case-insensitive
lookup functionality for its members.
Source code in aiperf/common/enums/base_enums.py
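The case-insensitive lookup can be sketched with a minimal reimplementation using Enum's `_missing_` hook. This is an illustration of the concept, not the library's actual code:

```python
from enum import Enum


class CaseInsensitiveStrEnum(str, Enum):
    """Str-valued enum whose value lookup ignores case."""

    @classmethod
    def _missing_(cls, value):
        # Called by Enum when the exact value lookup fails;
        # retry the lookup case-insensitively.
        if isinstance(value, str):
            for member in cls:
                if member.value.lower() == value.lower():
                    return member
        return None


class Color(CaseInsensitiveStrEnum):
    RED = "red"
    BLUE = "blue"
```

With a plain `Enum`, `Color("RED")` would raise `ValueError` because the stored value is `"red"`; `_missing_` gives the enum a second chance to resolve the value, so `Color("RED") is Color.RED` holds.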
aiperf.common.enums.benchmark_suite_enums¶
BenchmarkSuiteCompletionTrigger
¶
Bases: CaseInsensitiveStrEnum
Determines how completion of the suite is detected, in order to know how to track progress.
Source code in aiperf/common/enums/benchmark_suite_enums.py
COMPLETED_PROFILES = 'completed_profiles'
class-attribute
instance-attribute
¶
The suite will run until all profiles are completed.
BenchmarkSuiteType
¶
Bases: CaseInsensitiveStrEnum
Determines the type of suite, in order to know how to track progress.
Source code in aiperf/common/enums/benchmark_suite_enums.py
SINGLE_PROFILE = 'single_profile'
class-attribute
instance-attribute
¶
A suite with a single profile run.
aiperf.common.enums.command_enums¶
aiperf.common.enums.communication_enums¶
CommAddress
¶
Bases: CaseInsensitiveStrEnum
Enum for specifying the address type for communication clients. This is used to lookup the address in the communication config.
Source code in aiperf/common/enums/communication_enums.py
CREDIT_DROP = 'credit_drop'
class-attribute
instance-attribute
¶
Address to send CreditDrop messages from the TimingManager to the Worker.
CREDIT_RETURN = 'credit_return'
class-attribute
instance-attribute
¶
Address to send CreditReturn messages from the Worker to the TimingManager.
DATASET_MANAGER_PROXY_BACKEND = 'dataset_manager_proxy_backend'
class-attribute
instance-attribute
¶
Backend address for the DatasetManager to receive requests from clients.
DATASET_MANAGER_PROXY_FRONTEND = 'dataset_manager_proxy_frontend'
class-attribute
instance-attribute
¶
Frontend address for sending requests to the DatasetManager.
EVENT_BUS_PROXY_BACKEND = 'event_bus_proxy_backend'
class-attribute
instance-attribute
¶
Backend address for services to subscribe to messages.
EVENT_BUS_PROXY_FRONTEND = 'event_bus_proxy_frontend'
class-attribute
instance-attribute
¶
Frontend address for services to publish messages to.
RAW_INFERENCE_PROXY_BACKEND = 'raw_inference_proxy_backend'
class-attribute
instance-attribute
¶
Backend address for the InferenceParser to receive raw inference messages from Workers.
RAW_INFERENCE_PROXY_FRONTEND = 'raw_inference_proxy_frontend'
class-attribute
instance-attribute
¶
Frontend address for Workers to send raw inference messages to the InferenceParser.
RECORDS = 'records'
class-attribute
instance-attribute
¶
Address to send parsed records from InferenceParser to RecordManager.
aiperf.common.enums.data_exporter_enums¶
aiperf.common.enums.dataset_enums¶
aiperf.common.enums.endpoints_enums¶
EndpointType
¶
Bases: CaseInsensitiveStrEnum
Endpoint types.
These determine the format of request payload to send to the model.
Similar to endpoint_type_map and OutputFormat from genai-perf.
Source code in aiperf/common/enums/endpoints_enums.py
endpoint_path()
¶
Get the endpoint path for the endpoint type.
Source code in aiperf/common/enums/endpoints_enums.py
metrics_title()
¶
Get the title string for the endpoint type.
Source code in aiperf/common/enums/endpoints_enums.py
response_payload_type()
¶
Get the response payload type for the endpoint type.
Source code in aiperf/common/enums/endpoints_enums.py
ResponsePayloadType
¶
Bases: CaseInsensitiveStrEnum
Response payload types.
These determine the format of the response payload that the model will return.
Equivalent to output_format from genai-perf.
Source code in aiperf/common/enums/endpoints_enums.py
from_endpoint_type(endpoint_type)
classmethod
¶
Get the response payload type for the endpoint type.
Source code in aiperf/common/enums/endpoints_enums.py
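The classmethod can be sketched as a dictionary lookup from endpoint type to payload format. The enum members below are hypothetical placeholders for illustration, not the library's actual values:

```python
from enum import Enum


class EndpointType(str, Enum):
    # Hypothetical members for illustration only.
    CHAT = "chat"
    COMPLETIONS = "completions"


class ResponsePayloadType(str, Enum):
    # Hypothetical members for illustration only.
    OPENAI_CHAT = "openai_chat"
    OPENAI_COMPLETIONS = "openai_completions"

    @classmethod
    def from_endpoint_type(cls, endpoint_type: EndpointType) -> "ResponsePayloadType":
        """Map an endpoint type to the payload format its responses use."""
        mapping = {
            EndpointType.CHAT: cls.OPENAI_CHAT,
            EndpointType.COMPLETIONS: cls.OPENAI_COMPLETIONS,
        }
        return mapping[endpoint_type]
```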
aiperf.common.enums.logging_enums¶
aiperf.common.enums.measurement_enums¶
aiperf.common.enums.message_enums¶
MessageType
¶
Bases: CaseInsensitiveStrEnum
The various types of messages that can be sent between services.
The message type is used to determine what Pydantic model the message maps to,
based on the message_type field in the message model. For detailed explanations
of each message type, go to its definition in :mod:aiperf.common.messages.
Source code in aiperf/common/enums/message_enums.py
NotificationType
¶
Bases: CaseInsensitiveStrEnum
Types of notifications that can be sent to other services.
Source code in aiperf/common/enums/message_enums.py
DATASET_CONFIGURED = 'dataset_configured'
class-attribute
instance-attribute
¶
A notification sent to notify other services that the dataset has been configured.
aiperf.common.enums.metric_enums¶
MetricTimeType
¶
Bases: CaseInsensitiveStrEnum
Defines the time types for metrics.
Source code in aiperf/common/enums/metric_enums.py
short_name()
¶
Get the short name for the time type.
Source code in aiperf/common/enums/metric_enums.py
aiperf.common.enums.model_enums¶
Modality
¶
Bases: CaseInsensitiveStrEnum
Modality of the model. Can be used to determine the type of data to send to the model in conjunction with the ModelSelectionStrategy.MODALITY_AWARE.
Source code in aiperf/common/enums/model_enums.py
ModelSelectionStrategy
¶
Bases: CaseInsensitiveStrEnum
Strategy for selecting the model to use for the request.
Source code in aiperf/common/enums/model_enums.py
aiperf.common.enums.post_processor_enums¶
StreamingPostProcessorType
¶
Bases: CaseInsensitiveStrEnum
Type of response streamer.
Source code in aiperf/common/enums/post_processor_enums.py
BASIC_METRICS = 'basic_metrics'
class-attribute
instance-attribute
¶
Streamer that handles the basic metrics of the records.
JSONL = 'jsonl'
class-attribute
instance-attribute
¶
Streams all parsed records to a JSONL file.
PROCESSING_STATS = 'processing_stats'
class-attribute
instance-attribute
¶
Streamer that provides the processing stats of the records.
aiperf.common.enums.service_enums¶
LifecycleState
¶
Bases: CaseInsensitiveStrEnum
The various states a lifecycle can be in.
Source code in aiperf/common/enums/service_enums.py
ServiceRegistrationStatus
¶
Bases: CaseInsensitiveStrEnum
Defines the various states a service can be in during registration with the SystemController.
Source code in aiperf/common/enums/service_enums.py
ERROR = 'error'
class-attribute
instance-attribute
¶
The service registration failed.
REGISTERED = 'registered'
class-attribute
instance-attribute
¶
The service is registered with the SystemController.
TIMEOUT = 'timeout'
class-attribute
instance-attribute
¶
The service registration timed out.
UNREGISTERED = 'unregistered'
class-attribute
instance-attribute
¶
The service is not registered with the SystemController. This is the initial state.
WAITING = 'waiting'
class-attribute
instance-attribute
¶
The service is waiting for the SystemController to register it. This is a temporary state that should be followed by REGISTERED, TIMEOUT, or ERROR.
ServiceRunType
¶
Bases: CaseInsensitiveStrEnum
The different ways the SystemController should run the component services.
Source code in aiperf/common/enums/service_enums.py
KUBERNETES = 'k8s'
class-attribute
instance-attribute
¶
Run each service as a separate Kubernetes pod. This is the default way for multi-node deployments.
MULTIPROCESSING = 'process'
class-attribute
instance-attribute
¶
Run each service as a separate process. This is the default way for single-node deployments.
ServiceType
¶
Bases: CaseInsensitiveStrEnum
Types of services in the AIPerf system.
This is used to identify the service type when registering with the SystemController. It can also be used for tracking purposes if multiple instances of the same service type are running.
Source code in aiperf/common/enums/service_enums.py
aiperf.common.enums.sse_enums¶
SSEEventType
¶
Bases: CaseInsensitiveStrEnum
Event types in an SSE message. Many of these are custom and not defined by the SSE spec.
Source code in aiperf/common/enums/sse_enums.py
SSEFieldType
¶
Bases: CaseInsensitiveStrEnum
Field types in an SSE message.
Source code in aiperf/common/enums/sse_enums.py
aiperf.common.enums.system_enums¶
SystemState
¶
Bases: CaseInsensitiveStrEnum
State of the system as a whole.
This is used to track the state of the system as a whole, and is used to determine what actions to take when a signal is received.
Source code in aiperf/common/enums/system_enums.py
CONFIGURING = 'configuring'
class-attribute
instance-attribute
¶
The system is configuring services.
INITIALIZING = 'initializing'
class-attribute
instance-attribute
¶
The system is initializing. This is the initial state.
PROCESSING = 'processing'
class-attribute
instance-attribute
¶
The system is processing results.
PROFILING = 'profiling'
class-attribute
instance-attribute
¶
The system is running a profiling run.
READY = 'ready'
class-attribute
instance-attribute
¶
The system is ready to start profiling. This is a temporary state that should be followed by PROFILING.
SHUTDOWN = 'shutdown'
class-attribute
instance-attribute
¶
The system is shutting down. This is the final state.
STOPPING = 'stopping'
class-attribute
instance-attribute
¶
The system is stopping.
aiperf.common.enums.timing_enums¶
CreditPhase
¶
Bases: CaseInsensitiveStrEnum
The type of credit phase. This is used to identify which phase of the benchmark the credit is being used in, for tracking and reporting purposes.
Source code in aiperf/common/enums/timing_enums.py
PROFILING = 'profiling'
class-attribute
instance-attribute
¶
The credit phase is the steady state phase. This is the primary phase of the benchmark, and what is used to calculate the final results.
WARMUP = 'warmup'
class-attribute
instance-attribute
¶
The credit phase is the warmup phase. This is used to warm up the model before the benchmark starts.
RequestRateMode
¶
Bases: CaseInsensitiveStrEnum
The different ways the RequestRateStrategy should generate requests.
Source code in aiperf/common/enums/timing_enums.py
TimingMode
¶
Bases: CaseInsensitiveStrEnum
The different ways the TimingManager should generate requests.
Source code in aiperf/common/enums/timing_enums.py
CONCURRENCY = 'concurrency'
class-attribute
instance-attribute
¶
A mode where the TimingManager will maintain a continuous stream of concurrent requests.
FIXED_SCHEDULE = 'fixed_schedule'
class-attribute
instance-attribute
¶
A mode where the TimingManager will send requests according to a fixed schedule.
REQUEST_RATE = 'request_rate'
class-attribute
instance-attribute
¶
A mode where the TimingManager will send requests at either a constant request rate or based on a Poisson distribution.
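For a Poisson-distributed request rate, inter-arrival times are exponentially distributed with mean 1/rate. A sketch of generating such a request schedule (an illustration, not the TimingManager's actual code):

```python
import random


def poisson_schedule(rate_per_sec: float, num_requests: int, seed: int = 0) -> list[float]:
    """Request timestamps (seconds) with exponentially distributed gaps."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(num_requests):
        # Exponential gaps with mean 1 / rate give a Poisson arrival process.
        t += rng.expovariate(rate_per_sec)
        times.append(t)
    return times


times = poisson_schedule(rate_per_sec=10.0, num_requests=100)
# The average gap should be close to 1 / rate = 0.1 s.
```

A constant request rate is the degenerate case where every gap is exactly 1/rate.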
aiperf.common.exceptions¶
AIPerfError
¶
Bases: Exception
Base class for all exceptions raised by AIPerf.
Source code in aiperf/common/exceptions.py
__str__()
¶
Return the string representation of the exception with the class name.
Source code in aiperf/common/exceptions.py
raw_str()
¶
Return the raw string representation of the exception.
Source code in aiperf/common/exceptions.py
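A plausible sketch of the two string representations described above (illustrative, not the actual implementation):

```python
class AIPerfError(Exception):
    def raw_str(self) -> str:
        # The raw message, without the class name.
        return super().__str__()

    def __str__(self) -> str:
        # Prefix the message with the exception class name.
        return f"{self.__class__.__name__}: {self.raw_str()}"


class NotFoundError(AIPerfError):
    pass


err = NotFoundError("service 'worker' not registered")
print(str(err))       # NotFoundError: service 'worker' not registered
print(err.raw_str())  # service 'worker' not registered
```

Including the class name in `__str__` makes log lines self-describing even when the traceback is not printed, while `raw_str` remains available for user-facing messages.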
AIPerfMultiError
¶
Bases: AIPerfError
Exception raised when running multiple tasks and one or more fail.
Source code in aiperf/common/exceptions.py
CommunicationError
¶
Bases: AIPerfError
Generic communication error.
Source code in aiperf/common/exceptions.py
ConfigurationError
¶
Bases: AIPerfError
Exception raised when something fails to configure, or there is a configuration error.
Source code in aiperf/common/exceptions.py
DatasetError
¶
Bases: AIPerfError
Generic dataset error.
Source code in aiperf/common/exceptions.py
DatasetGeneratorError
¶
Bases: AIPerfError
Generic dataset generator error.
Source code in aiperf/common/exceptions.py
FactoryCreationError
¶
Bases: AIPerfError
Exception raised when a factory encounters an error while creating a class.
Source code in aiperf/common/exceptions.py
InferenceClientError
¶
Bases: AIPerfError
Exception raised when an inference client encounters an error.
Source code in aiperf/common/exceptions.py
InitializationError
¶
Bases: AIPerfError
Exception raised when something fails to initialize.
Source code in aiperf/common/exceptions.py
InvalidOperationError
¶
Bases: AIPerfError
Exception raised when an operation is invalid.
Source code in aiperf/common/exceptions.py
InvalidPayloadError
¶
Bases: InferenceClientError
Exception raised when an inference client receives an invalid payload.
Source code in aiperf/common/exceptions.py
InvalidStateError
¶
Bases: AIPerfError
Exception raised when something is in an invalid state.
Source code in aiperf/common/exceptions.py
MetricTypeError
¶
Bases: AIPerfError
Exception raised when a metric type encounters an error while creating a class.
Source code in aiperf/common/exceptions.py
NotFoundError
¶
Bases: AIPerfError
Exception raised when something is not found or not available.
Source code in aiperf/common/exceptions.py
NotInitializedError
¶
Bases: AIPerfError
Exception raised when something that should be initialized is not.
Source code in aiperf/common/exceptions.py
ProxyError
¶
Bases: AIPerfError
Exception raised when a proxy encounters an error.
Source code in aiperf/common/exceptions.py
ServiceError
¶
Bases: AIPerfError
Generic service error.
Source code in aiperf/common/exceptions.py
ShutdownError
¶
Bases: AIPerfError
Exception raised when a service encounters an error while shutting down.
Source code in aiperf/common/exceptions.py
UnsupportedHookError
¶
Bases: AIPerfError
Exception raised when a hook is defined on a class that does not have any base classes that provide that hook type.
Source code in aiperf/common/exceptions.py
ValidationError
¶
Bases: AIPerfError
Exception raised when something fails validation.
Source code in aiperf/common/exceptions.py
aiperf.common.factories¶
AIPerfFactory
¶
Bases: Generic[ClassEnumT, ClassProtocolT]
Defines a custom factory for AIPerf components.
This class is used to create a factory for a given class type and protocol.
Example:
```python
# Define a new enum for the expected implementation types.
# This is optional, but recommended for type safety.
class DatasetLoaderType(CaseInsensitiveStrEnum):
    FILE = "file"
    S3 = "s3"


# Define a new class protocol.
class DatasetLoaderProtocol(Protocol):
    def load(self) -> Dataset:
        pass


# Create a new factory for the class type and protocol.
class DatasetFactory(AIPerfFactory[DatasetLoaderType, DatasetLoaderProtocol]):
    pass


# Register a new class type mapping to its corresponding class.
# It should implement the class protocol.
@DatasetFactory.register(DatasetLoaderType.FILE)
class FileDatasetLoader:
    def __init__(self, filename: str):
        self.filename = filename

    def load(self) -> Dataset:
        return Dataset.from_file(self.filename)


DatasetConfig = {
    "type": DatasetLoaderType.FILE,
    "filename": "data.csv",
}

# Create a new instance of the class.
if DatasetConfig["type"] == DatasetLoaderType.FILE:
    dataset_instance = DatasetFactory.create_instance(
        DatasetLoaderType.FILE, filename=DatasetConfig["filename"]
    )
else:
    raise ValueError(f"Unsupported dataset loader type: {DatasetConfig['type']}")

dataset_instance.load()
```
Source code in aiperf/common/factories.py
create_instance(class_type, **kwargs)
classmethod
¶
Create a new class instance.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| class_type | ClassEnumT \| str | The type of class to create | required |
| **kwargs | Any | Additional arguments for the class | {} |

Returns:

| Type | Description |
|---|---|
| ClassProtocolT | The created class instance |

Raises:

| Type | Description |
|---|---|
| FactoryCreationError | If the class type is not registered or there is an error creating the instance |
Source code in aiperf/common/factories.py
get_all_class_types()
classmethod
¶
Get all registered class types.
Source code in aiperf/common/factories.py
get_all_classes()
classmethod
¶
Get all registered classes.
Returns:

| Type | Description |
|---|---|
| list[type[ClassProtocolT]] | A list of all registered class types implementing the expected protocol |
Source code in aiperf/common/factories.py
get_all_classes_and_types()
classmethod
¶
Get all registered classes and their corresponding class types.
Source code in aiperf/common/factories.py
get_class_from_type(class_type)
classmethod
¶
Get the class from a class type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| class_type | ClassEnumT \| str | The class type to get the class from | required |

Returns:

| Type | Description |
|---|---|
| type[ClassProtocolT] | The class for the given class type |

Raises:

| Type | Description |
|---|---|
| TypeError | If the class type is not registered |
Source code in aiperf/common/factories.py
register(class_type, override_priority=0)
classmethod
¶
Register a new class type mapping to its corresponding class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| class_type | ClassEnumT \| str | The type of class to register | required |
| override_priority | int | The priority of the override. The higher the priority, the more precedence the override has when multiple classes are registered for the same class type. Built-in classes have a priority of 0. | 0 |

Returns:

| Type | Description |
|---|---|
| Callable | Decorator for the class that implements the class protocol |
Source code in aiperf/common/factories.py
register_all(*class_types, override_priority=0)
classmethod
¶
Register multiple class types mapping to a single corresponding class. This is useful if a single class implements multiple types. Currently, a single override priority applies to all of the registered types.
Source code in aiperf/common/factories.py
AIPerfSingletonFactory
¶
Bases: AIPerfFactory[ClassEnumT, ClassProtocolT]
Factory for registering and creating singleton instances of a given class type and protocol.
This factory is useful for creating instances that are shared across the application, such as communication clients.
Calling create_instance will create a new instance if it doesn't exist, otherwise it will return the existing instance.
Calling get_instance will return the existing instance if it exists, otherwise it will raise an error.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
create_instance(class_type, **kwargs)
classmethod
¶
Create a new instance of the given class type. If the instance does not exist, or the process ID has changed, a new instance will be created.
Source code in aiperf/common/factories.py
get_or_create_instance(class_type, **kwargs)
classmethod
¶
Syntactic sugar for create_instance, but with a more descriptive name for singleton factories.
Source code in aiperf/common/factories.py
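The singleton behavior, including the "process ID has changed" recreation noted in create_instance above, can be illustrated with a minimal sketch. The names and registry logic here are assumptions for illustration, not the library's implementation:

```python
import os


class SingletonFactory:
    """Minimal sketch: cache one instance per registered type key."""

    _registry: dict = {}
    _instances: dict = {}

    @classmethod
    def register(cls, key):
        def decorator(klass):
            cls._registry[key] = klass
            return klass
        return decorator

    @classmethod
    def create_instance(cls, key, **kwargs):
        # Recreate if no instance exists yet, or if the process has
        # forked since the cached instance was created.
        cached = cls._instances.get(key)
        if cached is None or cached[0] != os.getpid():
            cls._instances[key] = (os.getpid(), cls._registry[key](**kwargs))
        return cls._instances[key][1]

    # Descriptive alias, mirroring get_or_create_instance above.
    get_or_create_instance = create_instance


@SingletonFactory.register("comm")
class Comm:
    def __init__(self, url: str = "tcp://localhost"):
        self.url = url
```

Repeated calls with the same key return the same object, which is what makes this pattern suitable for shared resources like communication clients.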
CommunicationClientFactory
¶
Bases: AIPerfFactory[CommClientType, 'CommunicationClientProtocol']
Factory for registering and creating CommunicationClientProtocol instances based on the specified communication client type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
CommunicationFactory
¶
Bases: AIPerfSingletonFactory[CommunicationBackend, 'CommunicationProtocol']
Factory for registering and creating CommunicationProtocol instances based on the specified communication backend.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ComposerFactory
¶
Bases: AIPerfFactory[ComposerType, 'BaseDatasetComposer']
Factory for registering and creating BaseDatasetComposer instances based on the specified composer type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
CustomDatasetFactory
¶
Bases: AIPerfFactory[CustomDatasetType, 'CustomDatasetLoaderProtocol']
Factory for registering and creating CustomDatasetLoaderProtocol instances based on the specified custom dataset type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
DataExporterFactory
¶
Bases: AIPerfFactory[DataExporterType, 'DataExporterProtocol']
Factory for registering and creating DataExporterProtocol instances based on the specified data exporter type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
InferenceClientFactory
¶
Bases: AIPerfFactory[EndpointType, 'InferenceClientProtocol']
Factory for registering and creating InferenceClientProtocol instances based on the specified endpoint type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
PostProcessorFactory
¶
Bases: AIPerfFactory[PostProcessorType, 'PostProcessorProtocol']
Factory for registering and creating PostProcessorProtocol instances based on the specified post processor type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
RequestConverterFactory
¶
Bases: AIPerfSingletonFactory[EndpointType, 'RequestConverterProtocol']
Factory for registering and creating RequestConverterProtocol instances based on the specified request payload type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ResponseExtractorFactory
¶
Bases: AIPerfFactory[EndpointType, 'ResponseExtractorProtocol']
Factory for registering and creating ResponseExtractorProtocol instances based on the specified response extractor type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ServiceFactory
¶
Bases: AIPerfFactory[ServiceType, 'ServiceProtocol']
Factory for registering and creating ServiceProtocol instances based on the specified service type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ServiceManagerFactory
¶
Bases: AIPerfFactory[ServiceRunType, 'ServiceManagerProtocol']
Factory for registering and creating ServiceManagerProtocol instances based on the specified service run type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
StreamingPostProcessorFactory
¶
Bases: AIPerfFactory[StreamingPostProcessorType, 'StreamingPostProcessorProtocol']
Factory for registering and creating StreamingPostProcessorProtocol instances based on the specified streaming post processor type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
ZMQProxyFactory
¶
Bases: AIPerfFactory[ZMQProxyType, 'BaseZMQProxy']
Factory for registering and creating BaseZMQProxy instances based on the specified ZMQ proxy type.
see: :class:aiperf.common.factories.AIPerfFactory for more details.
Source code in aiperf/common/factories.py
aiperf.common.hooks¶
This module provides an extensive set of hook definitions for AIPerf. It is designed to be
used in conjunction with the :class:HooksMixin for classes to provide support for hooks.
It provides a simple interface for registering hooks.
Classes should inherit from the :class:HooksMixin, and specify the provided
hook types by decorating the class with the :func:provides_hooks decorator.
The hook functions are registered by decorating functions with the various hook
decorators such as :func:on_init, :func:on_start, :func:on_stop, etc.
More than one hook can be registered for a given hook type, and classes that inherit from classes with existing hooks will inherit the hooks from the base classes as well.
The hooks are run by calling the :meth:HooksMixin.run_hooks method or retrieved via the
:meth:HooksMixin.get_hooks method on the class.
HookType = AIPerfHook | str
module-attribute
¶
Type alias for valid hook types. This is a union of the AIPerfHook enum and any user-defined custom strings.
Hook
¶
Bases: BaseModel, Generic[HookParamsT]
A hook is a function that is decorated with a hook type and optional parameters. The HookParamsT is the type of the parameters. You can either have a static value, or a callable that returns the parameters.
Source code in aiperf/common/hooks.py, lines 69-113
resolve_params(self_obj)
¶
Resolve the parameters for the hook. If the parameters are a callable, it will be called with the self_obj as the argument, otherwise the parameters are returned as is.
Source code in aiperf/common/hooks.py, lines 90-104
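The static-versus-callable distinction can be sketched in plain Python; `HookSketch` and `Service` below are illustrative stand-ins, not AIPerf classes:

```python
from typing import Any

class HookSketch:
    """Minimal stand-in for Hook's parameter resolution (not the real class)."""

    def __init__(self, params: Any) -> None:
        self.params = params

    def resolve_params(self, self_obj: Any) -> Any:
        # Callables are invoked with the owning object; static values pass through.
        return self.params(self_obj) if callable(self.params) else self.params

class Service:
    interval = 2.5

static_hook = HookSketch(params=1.0)
dynamic_hook = HookSketch(params=lambda svc: svc.interval)

print(static_hook.resolve_params(Service()))   # 1.0
print(dynamic_hook.resolve_params(Service()))  # 2.5
```

The callable form is what lets hook parameters depend on per-instance state, such as a configurable interval.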
HookAttrs
¶
Constant attribute names for hooks.
When you decorate a function with a hook decorator, the hook type and parameters are set as attributes on the function or class.
Source code in aiperf/common/hooks.py, lines 57-66
background_task(interval=None, immediate=True, stop_on_error=False)
¶
Decorator to mark a method as a background task with automatic management.
Tasks are automatically started when the service starts and stopped when the service stops. The decorated method will be run periodically in the background when the service is running.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
interval
|
float | Callable[[SelfT], float] | None
|
Time between task executions in seconds. If None, the task will run once. Can be a callable that returns the interval, and will be called with 'self' as the argument. |
None
|
immediate
|
bool
|
If True, run the task immediately on start, otherwise wait for the interval first. |
True
|
stop_on_error
|
bool
|
If True, stop the task on any exception, otherwise log and continue. |
False
|
Example:
class MyPlugin(AIPerfLifecycleMixin):
@background_task(interval=1.0)
def _background_task(self) -> None:
pass
The above is equivalent to setting:
MyPlugin._background_task.__aiperf_hook_type__ = AIPerfHook.BACKGROUND_TASK
MyPlugin._background_task.__aiperf_hook_params__ = BackgroundTaskParams(
interval=1.0, immediate=True, stop_on_error=False
)
Source code in aiperf/common/hooks.py, lines 158-196
on_command(*command_types)
¶
Decorator to specify that the function is a hook that should be called when a CommandMessage with the given
command type(s) is received from the message bus.
See :func:aiperf.common.hooks._hook_decorator_for_message_types.
Example:
class MyService(BaseComponentService):
@on_command(CommandType.PROFILE_START)
def _on_profile_start(self, message: ProfileStartCommand) -> CommandResponse:
pass
The above is equivalent to setting:
MyService._on_profile_start.__aiperf_hook_type__ = AIPerfHook.ON_COMMAND
MyService._on_profile_start.__aiperf_hook_params__ = (CommandType.PROFILE_START,)
Source code in aiperf/common/hooks.py, lines 381-402
on_init(func)
¶
Decorator to specify that the function is a hook that should be called during initialization.
See :func:aiperf.common.hooks._hook_decorator.
Example:
class MyPlugin(AIPerfLifecycleMixin):
@on_init
def _init_plugin(self) -> None:
pass
The above is equivalent to setting:
MyPlugin._init_plugin.__aiperf_hook_type__ = AIPerfHook.ON_INIT
Source code in aiperf/common/hooks.py, lines 224-241
on_message(*message_types)
¶
Decorator to specify that the function is a hook that should be called when messages of the
given type(s) are received from the message bus.
See :func:aiperf.common.hooks._hook_decorator_with_params.
Example:
class MyService(MessageBusClientMixin):
@on_message(MessageType.STATUS)
def _on_status_message(self, message: StatusMessage) -> None:
pass
The above is equivalent to setting:
MyService._on_status_message.__aiperf_hook_type__ = AIPerfHook.ON_MESSAGE
MyService._on_status_message.__aiperf_hook_params__ = (MessageType.STATUS,)
Source code in aiperf/common/hooks.py, lines 306-327
on_pull_message(*message_types)
¶
Decorator to specify that the function is a hook that should be called when a pull client
receives a message of the given type(s).
See :func:aiperf.common.hooks._hook_decorator_for_message_types.
Example:
class MyService(PullClientMixin, BaseComponentService):
@on_pull_message(MessageType.CREDIT_DROP)
def _on_credit_drop_pull(self, message: CreditDropMessage) -> None:
pass
The above is equivalent to setting:
MyService._on_pull_message.__aiperf_hook_type__ = AIPerfHook.ON_PULL_MESSAGE
MyService._on_pull_message.__aiperf_hook_params__ = (MessageType.CREDIT_DROP,)
Source code in aiperf/common/hooks.py, lines 330-350
on_request(*message_types)
¶
Decorator to specify that the function is a hook that should be called when requests of the
given type(s) are received from a ReplyClient.
See :func:aiperf.common.hooks._hook_decorator_for_message_types.
Example:
class MyService(RequestClientMixin, BaseComponentService):
@on_request(MessageType.CONVERSATION_REQUEST)
async def _handle_conversation_request(
self, message: ConversationRequestMessage
) -> ConversationResponseMessage:
return ConversationResponseMessage(
...
)
The above is equivalent to setting:
MyService._handle_conversation_request.__aiperf_hook_type__ = AIPerfHook.ON_REQUEST
MyService._handle_conversation_request.__aiperf_hook_params__ = (MessageType.CONVERSATION_REQUEST,)
Source code in aiperf/common/hooks.py, lines 353-378
on_start(func)
¶
Decorator to specify that the function is a hook that should be called during start.
See :func:aiperf.common.hooks._hook_decorator.
Example:
class MyPlugin(AIPerfLifecycleMixin):
@on_start
def _start_plugin(self) -> None:
pass
The above is equivalent to setting:
MyPlugin._start_plugin.__aiperf_hook_type__ = AIPerfHook.ON_START
Source code in aiperf/common/hooks.py, lines 244-261
on_state_change(func)
¶
Decorator to specify that the function is a hook that should be called during the service state change.
See :func:aiperf.common.hooks._hook_decorator.
Example:
class MyPlugin(AIPerfLifecycleMixin):
@on_state_change
def _on_state_change(self, old_state: LifecycleState, new_state: LifecycleState) -> None:
pass
The above is equivalent to setting:
MyPlugin._on_state_change.__aiperf_hook_type__ = AIPerfHook.ON_STATE_CHANGE
Source code in aiperf/common/hooks.py, lines 284-303
on_stop(func)
¶
Decorator to specify that the function is a hook that should be called during stop.
See :func:aiperf.common.hooks._hook_decorator.
Example:
class MyPlugin(AIPerfLifecycleMixin):
@on_stop
def _stop_plugin(self) -> None:
pass
The above is equivalent to setting:
MyPlugin._stop_plugin.__aiperf_hook_type__ = AIPerfHook.ON_STOP
Source code in aiperf/common/hooks.py, lines 264-281
provides_hooks(*hook_types)
¶
Decorator to specify that the class provides a hook of the given type to all of its subclasses.
Example:
@provides_hooks(AIPerfHook.ON_MESSAGE)
class MessageBusClientMixin(CommunicationMixin):
pass
The above is equivalent to setting:
MessageBusClientMixin.__provides_hooks__ = {AIPerfHook.ON_MESSAGE}
Source code in aiperf/common/hooks.py, lines 199-221
aiperf.common.logging¶
MultiProcessLogHandler
¶
Bases: RichHandler
Custom logging handler that forwards log records to a multiprocessing queue.
Source code in aiperf/common/logging.py, lines 169-199
emit(record)
¶
Emit a log record to the queue.
Source code in aiperf/common/logging.py, lines 179-199
create_file_handler(log_folder, level)
¶
Configure a file handler for logging.
Source code in aiperf/common/logging.py, lines 149-166
get_global_log_queue()
cached
¶
Get the global log queue. Will create a new queue if it doesn't exist.
Source code in aiperf/common/logging.py, lines 23-26
setup_child_process_logging(log_queue=None, service_id=None, service_config=None, user_config=None)
¶
Set up logging for a child process to send logs to the main process.
This should be called early in child process initialization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
log_queue
|
Queue | None
|
The multiprocessing queue to send logs to. If None, tries to get the global queue. |
None
|
service_id
|
str | None
|
The ID of the service to log under. If None, logs will be under the process name. |
None
|
service_config
|
ServiceConfig | None
|
The service configuration used to determine the log level. |
None
|
user_config
|
UserConfig | None
|
The user configuration used to determine the log folder. |
None
|
Source code in aiperf/common/logging.py, lines 48-112
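The queue-forwarding pattern this function implements can be sketched with the standard library's `QueueHandler`; this is an illustrative stand-in (a plain `queue.Queue` replaces the multiprocessing queue, and the logger name is made up):

```python
import logging
import logging.handlers
import queue

# A plain queue.Queue stands in for the shared multiprocessing queue.
log_queue: "queue.Queue[logging.LogRecord]" = queue.Queue()

# In the child process: attach a QueueHandler so records are forwarded
# to the main process instead of being written locally.
child_logger = logging.getLogger("aiperf.child.sketch")
child_logger.setLevel(logging.INFO)
child_logger.addHandler(logging.handlers.QueueHandler(log_queue))

child_logger.info("hello from child")

# In the main process: drain the queue and hand records to the real
# handlers (rich console output, file handler, etc.).
forwarded = log_queue.get_nowait()
print(forwarded.getMessage())  # hello from child
```

In the real system the main process owns the handlers, so log output from all child processes is serialized through one place.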
setup_rich_logging(user_config, service_config)
¶
Set up rich logging with appropriate configuration.
Source code in aiperf/common/logging.py, lines 116-146
aiperf.common.messages.base_messages¶
ErrorMessage
¶
Bases: Message
Message containing error data.
Source code in aiperf/common/messages/base_messages.py, lines 114-119
Message
¶
Bases: AIPerfBaseModel
Base message class for optimized message handling. Because it is based on the AIPerfBaseModel class,
it supports the @exclude_if_none decorator; see :class:AIPerfBaseModel for more details.
This class provides a base for all messages, including common fields like message_type, request_ns, and request_id. It also supports optional field exclusion based on the @exclude_if_none decorator.
Each message model should inherit from this class, set the message_type field, and define its own additional fields.
Example:
@exclude_if_none("some_field")
class ExampleMessage(Message):
some_field: int | None = Field(default=None)
other_field: int = Field(default=1)
Source code in aiperf/common/messages/base_messages.py, lines 19-102
from_json(json_str)
classmethod
¶
Deserialize a message from a JSON string, attempting to auto-detect the message type.
NOTE: If you already know the message type, use the more performant :meth:from_json_with_type instead.
Source code in aiperf/common/messages/base_messages.py, lines 71-85
from_json_with_type(message_type, json_str)
classmethod
¶
Deserialize a message from a JSON string with a specific message type.
NOTE: This is more performant than :meth:from_json because it does not need to
convert the JSON string to a dictionary first.
Source code in aiperf/common/messages/base_messages.py, lines 87-98
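The trade-off between the two deserialization paths can be sketched with a toy registry; class and field names below are illustrative, not AIPerf's actual models:

```python
import json
from typing import Any

# Hypothetical registry mapping message_type -> model class.
REGISTRY: dict[str, type] = {}

class StatusSketch:
    message_type = "status"

    def __init__(self, service_id: str) -> None:
        self.service_id = service_id

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "StatusSketch":
        return cls(service_id=data["service_id"])

REGISTRY[StatusSketch.message_type] = StatusSketch

def from_json(json_str: str) -> Any:
    # Auto-detect: must parse to a dict first just to read message_type.
    data = json.loads(json_str)
    return REGISTRY[data["message_type"]].from_dict(data)

def from_json_with_type(message_type: str, json_str: str) -> Any:
    # Known type: dispatch immediately; a pydantic-style model could
    # validate the raw JSON string directly here, skipping the detour.
    return REGISTRY[message_type].from_dict(json.loads(json_str))

payload = '{"message_type": "status", "service_id": "worker-1"}'
print(from_json(payload).service_id)                      # worker-1
print(from_json_with_type("status", payload).service_id)  # worker-1
```

This is why the docs recommend the typed path on hot code paths where the message type is already known.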
to_json()
¶
Fast serialization without full validation.
Source code in aiperf/common/messages/base_messages.py, lines 100-102
RequiresRequestNSMixin
¶
Bases: Message
Mixin for messages that require a request_ns field.
Source code in aiperf/common/messages/base_messages.py, lines 105-111
aiperf.common.messages.command_messages¶
CommandMessage
¶
Bases: TargetedServiceMessage
Message containing command data. This message is sent by the system controller to a service to command it to do something.
Source code in aiperf/common/messages/command_messages.py, lines 54-99
from_json(json_str)
classmethod
¶
Deserialize a command message from a JSON string, attempting to auto-detect the command type.
Source code in aiperf/common/messages/command_messages.py, lines 82-99
CommandResponse
¶
Bases: TargetedServiceMessage
Message containing a command response.
Source code in aiperf/common/messages/command_messages.py, lines 102-167
from_json(json_str)
classmethod
¶
Deserialize a command response message from a JSON string, attempting to auto-detect the command response type.
Source code in aiperf/common/messages/command_messages.py, lines 141-167
CommandSuccessResponse
¶
Bases: CommandResponse
Generic command response message when a command succeeds. It should be subclassed for specific command types.
Source code in aiperf/common/messages/command_messages.py, lines 190-210
DiscoverServicesCommand
¶
Bases: CommandMessage
Command message sent to request services to discover services.
Source code in aiperf/common/messages/command_messages.py, lines 318-321
ProcessRecordsCommand
¶
Bases: CommandMessage
Data to send with the process records command.
Source code in aiperf/common/messages/command_messages.py, lines 286-294
ProcessRecordsResponse
¶
Bases: CommandSuccessResponse
Response to the process records command.
Source code in aiperf/common/messages/command_messages.py, lines 330-338
ProfileCancelCommand
¶
Bases: CommandMessage
Command message sent to request services to cancel profiling.
Source code in aiperf/common/messages/command_messages.py, lines 312-315
ProfileConfigureCommand
¶
Bases: CommandMessage
Data to send with the profile configure command.
Source code in aiperf/common/messages/command_messages.py, lines 297-303
ProfileStartCommand
¶
Bases: CommandMessage
Command message sent to request services to start profiling.
Source code in aiperf/common/messages/command_messages.py, lines 306-309
ShutdownCommand
¶
Bases: CommandMessage
Command message sent to request a service to shutdown.
Source code in aiperf/common/messages/command_messages.py, lines 324-327
TargetedServiceMessage
¶
Bases: BaseServiceMessage
Message that can be targeted to a specific service by id or type.
If both target_service_type and target_service_id are None, the message is
sent to all services that are subscribed to the message type.
Source code in aiperf/common/messages/command_messages.py, lines 25-51
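The targeting rule described above can be sketched as a plain function; the field names target_service_id and target_service_type come from the description, while the dict-based message and the function itself are illustrative stand-ins:

```python
from typing import Any

def should_handle(message: dict[str, Any], service_id: str, service_type: str) -> bool:
    """Sketch of the targeting rule: id takes precedence, then type, else broadcast."""
    target_id = message.get("target_service_id")
    target_type = message.get("target_service_type")
    if target_id is not None:
        return target_id == service_id
    if target_type is not None:
        return target_type == service_type
    return True  # neither target set: deliver to all subscribed services

broadcast = {}
by_type = {"target_service_type": "worker"}
by_id = {"target_service_id": "worker-2"}

print(should_handle(broadcast, "worker-1", "worker"))  # True
print(should_handle(by_type, "worker-1", "worker"))    # True
print(should_handle(by_id, "worker-1", "worker"))      # False
```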
aiperf.common.messages.credit_messages¶
CreditDropMessage
¶
Bases: BaseServiceMessage
Message indicating that a credit has been dropped. This message is sent by the timing manager to workers to indicate that credit(s) have been dropped.
Source code in aiperf/common/messages/credit_messages.py, lines 11-26
CreditPhaseCompleteMessage
¶
Bases: BaseServiceMessage
Message for credit phase complete. Sent by the TimingManager to report that a credit phase has completed.
Source code in aiperf/common/messages/credit_messages.py, lines 104-117
CreditPhaseProgressMessage
¶
Bases: BaseServiceMessage
Sent by the TimingManager to report the progress of a credit phase.
Source code in aiperf/common/messages/credit_messages.py, lines 81-90
CreditPhaseSendingCompleteMessage
¶
Bases: BaseServiceMessage
Message for credit phase sending complete. Sent by the TimingManager to report that a credit phase has completed sending.
Source code in aiperf/common/messages/credit_messages.py, lines 93-101
CreditPhaseStartMessage
¶
Bases: BaseServiceMessage
Message for credit phase start. Sent by the TimingManager to report that a credit phase has started.
Source code in aiperf/common/messages/credit_messages.py, lines 60-78
CreditReturnMessage
¶
Bases: BaseServiceMessage
Message indicating that a credit has been returned. This message is sent by a worker to the timing manager to indicate that work has been completed.
Source code in aiperf/common/messages/credit_messages.py, lines 29-57
CreditsCompleteMessage
¶
Bases: BaseServiceMessage
Credits complete message sent by the TimingManager to the System controller to signify all Credit Phases have been completed.
Source code in aiperf/common/messages/credit_messages.py, lines 120-124
aiperf.common.messages.dataset_messages¶
ConversationRequestMessage
¶
Bases: BaseServiceMessage
Message to request a full conversation by ID.
Source code in aiperf/common/messages/dataset_messages.py, lines 12-23
ConversationResponseMessage
¶
Bases: BaseServiceMessage
Message containing a full conversation.
Source code in aiperf/common/messages/dataset_messages.py, lines 26-30
ConversationTurnRequestMessage
¶
Bases: BaseServiceMessage
Message to request a single turn from a conversation.
Source code in aiperf/common/messages/dataset_messages.py, lines 33-46
ConversationTurnResponseMessage
¶
Bases: BaseServiceMessage
Message containing a single turn from a conversation.
Source code in aiperf/common/messages/dataset_messages.py, lines 49-54
DatasetConfiguredNotification
¶
Bases: BaseServiceMessage
Notification sent to notify other services that the dataset has been configured.
Source code in aiperf/common/messages/dataset_messages.py, lines 74-77
DatasetTimingRequest
¶
Bases: BaseServiceMessage
Message for a dataset timing request.
Source code in aiperf/common/messages/dataset_messages.py, lines 57-60
DatasetTimingResponse
¶
Bases: BaseServiceMessage
Message for a dataset timing response.
Source code in aiperf/common/messages/dataset_messages.py, lines 63-71
aiperf.common.messages.health_messages¶
WorkerHealthMessage
¶
Bases: BaseServiceMessage
Message for a worker health check.
Source code in aiperf/common/messages/health_messages.py, lines 17-55
completed_tasks
property
¶
The number of tasks that have been completed by the worker.
error_rate
property
¶
The error rate of the worker.
failed_tasks
property
¶
The number of tasks that have failed by the worker.
in_progress_tasks
property
¶
The number of tasks that are currently in progress by the worker.
total_tasks
property
¶
The total number of tasks that have been sent to the worker.
aiperf.common.messages.inference_messages¶
InferenceResultsMessage
¶
Bases: BaseServiceMessage
Message for inference results.
Source code in aiperf/common/messages/inference_messages.py, lines 20-27
ParsedInferenceResultsMessage
¶
Bases: BaseServiceMessage
Message for parsed inference results.
Source code in aiperf/common/messages/inference_messages.py, lines 30-37
aiperf.common.messages.progress_messages¶
AllRecordsReceivedMessage
¶
Bases: BaseServiceMessage, RequiresRequestNSMixin
This is sent by the RecordsManager to signal that all parsed records have been received, and the final processing stats are available.
Source code in aiperf/common/messages/progress_messages.py, lines 99-105
ProcessRecordsResultMessage
¶
Bases: BaseServiceMessage
Message for process records result.
Source code in aiperf/common/messages/progress_messages.py, lines 108-115
ProcessingStatsMessage
¶
Bases: BaseServiceMessage
Message for processing stats. Sent by the records manager to the system controller to report the stats of the profile run.
Source code in aiperf/common/messages/progress_messages.py, lines 56-72
ProfileProgressMessage
¶
Bases: BaseServiceMessage
Message for profile progress. Sent by the timing manager to the system controller to report the progress of the profile run.
Source code in aiperf/common/messages/progress_messages.py, lines 14-37
ProfileResultsMessage
¶
Bases: BaseServiceMessage
Message for profile results.
Source code in aiperf/common/messages/progress_messages.py, lines 91-96
RecordsProcessingStatsMessage
¶
Bases: BaseServiceMessage
Message for processing stats. Sent by the RecordsManager to report the stats of the profile run. This contains the stats for a single credit phase only.
Source code in aiperf/common/messages/progress_messages.py, lines 75-88
SweepProgressMessage
¶
Bases: BaseServiceMessage
Message for sweep progress.
Source code in aiperf/common/messages/progress_messages.py, lines 40-53
aiperf.common.messages.service_messages¶
BaseServiceErrorMessage
¶
Bases: BaseServiceMessage
Base message containing error data.
Source code in aiperf/common/messages/service_messages.py, lines 92-97
BaseServiceMessage
¶
Bases: Message
Base message that is sent from a service. Requires a service_id field to specify the service that sent the message.
Source code in aiperf/common/messages/service_messages.py, lines 21-28
BaseStatusMessage
¶
Bases: BaseServiceMessage
Base message containing status data. This message is sent by a service to the system controller to report its status.
Source code in aiperf/common/messages/service_messages.py, lines 31-48
HeartbeatMessage
¶
Bases: BaseStatusMessage
Message containing heartbeat data. This message is sent by a service to the system controller to indicate that it is still running.
Source code in aiperf/common/messages/service_messages.py, lines 67-73
NotificationMessage
¶
Bases: BaseServiceMessage
Message containing a notification from a service. This is used to notify other services of events.
Source code in aiperf/common/messages/service_messages.py, lines 76-89
RegistrationMessage
¶
Bases: BaseStatusMessage
Message containing registration data. This message is sent by a service to the system controller to register itself.
Source code in aiperf/common/messages/service_messages.py, lines 59-64
StatusMessage
¶
Bases: BaseStatusMessage
Message containing status data. This message is sent by a service to the system controller to report its status.
Source code in aiperf/common/messages/service_messages.py, lines 51-56
aiperf.common.mixins.aiperf_lifecycle_mixin¶
AIPerfLifecycleMixin
¶
Bases: TaskManagerMixin, HooksMixin
This mixin provides a lifecycle state machine and is the basis for most components in the AIPerf framework. It provides a set of hooks that are run at each state transition, and the ability to define background tasks that are automatically started on @on_start and canceled on @on_stop.
It exposes initialize, start, and stop methods to the outside world, along with the current state of the lifecycle, providing a simple interface for users to interact with.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 23-287
is_running
property
¶
Whether the lifecycle's current state is LifecycleState.RUNNING.
stop_requested
property
writable
¶
Whether the lifecycle has been requested to stop.
__init__(id=None, **kwargs)
¶
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
id
|
str | None
|
The id of the lifecycle. If not provided, a random uuid will be generated. |
None
|
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 40-54
attach_child_lifecycle(child)
¶
Attach a child lifecycle to manage. This child will now have its lifecycle managed and controlled by this lifecycle. Common use cases are having a Service be a parent lifecycle, and having supporting components such as streaming post processors, progress reporters, etc. be children.
Children will be called in the order they were attached for initialize and start, and in reverse order for stop.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 241-254
initialize()
async
¶
Initialize the lifecycle and run the @on_init hooks.
NOTE: Overriding this method is generally discouraged. Instead, use the @on_init hook to handle your own initialization logic.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 128-155
initialize_and_start()
async
¶
Initialize and start the lifecycle. This is a convenience method that calls initialize and start in sequence.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 182-185
start()
async
¶
Start the lifecycle and run the @on_start hooks.
NOTE: Overriding this method is generally discouraged. Instead, use the @on_start hook to handle your own starting logic.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 157-180
stop()
async
¶
Stop the lifecycle and run the @on_stop hooks.
NOTE: Overriding this method is generally discouraged. Instead, use the @on_stop hook to handle your own stopping logic.
Source code in aiperf/common/mixins/aiperf_lifecycle_mixin.py, lines 187-206
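The lifecycle ordering described above (initialize, then start, then stop, with hooks at each transition) can be sketched with a toy asyncio class; this is an illustrative stand-in, not the mixin's actual implementation:

```python
import asyncio
from enum import Enum

class State(Enum):
    CREATED = "created"
    INITIALIZED = "initialized"
    RUNNING = "running"
    STOPPED = "stopped"

class LifecycleSketch:
    """Toy stand-in for the mixin's public surface: each transition runs its hooks."""

    def __init__(self) -> None:
        self.state = State.CREATED
        self.events: list[str] = []

    async def initialize(self) -> None:
        self.events.append("on_init")   # @on_init hooks would run here
        self.state = State.INITIALIZED

    async def start(self) -> None:
        self.events.append("on_start")  # @on_start hooks; background tasks start
        self.state = State.RUNNING

    async def stop(self) -> None:
        self.events.append("on_stop")   # @on_stop hooks; background tasks canceled
        self.state = State.STOPPED

    async def initialize_and_start(self) -> None:
        await self.initialize()
        await self.start()

async def main() -> LifecycleSketch:
    lc = LifecycleSketch()
    await lc.initialize_and_start()
    assert lc.state is State.RUNNING
    await lc.stop()
    return lc

lc = asyncio.run(main())
print(lc.events)  # ['on_init', 'on_start', 'on_stop']
```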
aiperf.common.mixins.aiperf_logger_mixin¶
AIPerfLoggerMixin
¶
Bases: BaseMixin
Mixin to provide lazy evaluated logging for f-strings.
This mixin provides a logger with lazy evaluation support for f-strings, and direct log functions for all standard and custom logging levels.
see :class:AIPerfLogger for more details.
Usage
class MyClass(AIPerfLoggerMixin):
    def __init__(self):
        super().__init__()
        self.trace(lambda: f"Processing {item} of {count} ({item / count * 100}% complete)")
        self.info("Simple string message")
        self.debug(lambda i=i: f"Binding loop variable: {i}")
        self.warning("Warning message: %s", "legacy support")
        self.success("Benchmark completed successfully")
        self.notice("Warmup has completed")
        self.exception(f"Direct f-string usage: {e}")
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 23-109
critical(message, *args, **kwargs)
¶
Log a critical message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 106-109
debug(message, *args, **kwargs)
¶
Log a debug message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 71-74
error(message, *args, **kwargs)
¶
Log an error message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 96-99
exception(message, *args, **kwargs)
¶
Log an exception message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 101-104
info(message, *args, **kwargs)
¶
Log an info message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 76-79
log(level, message, *args, **kwargs)
¶
Log a message at a specified level with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 59-64
notice(message, *args, **kwargs)
¶
Log a notice message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 81-84
success(message, *args, **kwargs)
¶
Log a success message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 91-94
trace(message, *args, **kwargs)
¶
Log a trace message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 66-69
warning(message, *args, **kwargs)
¶
Log a warning message with lazy evaluation.
Source code in aiperf/common/mixins/aiperf_logger_mixin.py, lines 86-89
aiperf.common.mixins.base_mixin¶
BaseMixin
¶
Base mixin class.
This mixin creates a contract that mixins should always pass **kwargs to super().__init__(), regardless of whether they extend another mixin or not.
This ensures that BaseMixin is the last mixin to have its __init__ method called, which means that all other mixins form a proper chain of __init__ calls with the correct arguments and no accidentally broken inheritance.
Source code in aiperf/common/mixins/base_mixin.py, lines 5-19
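The cooperative **kwargs contract can be sketched with plain classes; `LoggerMixin`, `TaskMixin`, and `Service` below are illustrative stand-ins, not AIPerf classes:

```python
class BaseMixinSketch:
    """Terminal mixin in the chain: absorbs whatever kwargs remain."""

    def __init__(self, **kwargs) -> None:
        super().__init__()

class LoggerMixin(BaseMixinSketch):
    def __init__(self, log_level: str = "INFO", **kwargs) -> None:
        self.log_level = log_level
        super().__init__(**kwargs)  # always forward the rest of the chain

class TaskMixin(BaseMixinSketch):
    def __init__(self, max_tasks: int = 4, **kwargs) -> None:
        self.max_tasks = max_tasks
        super().__init__(**kwargs)

class Service(LoggerMixin, TaskMixin):
    pass

# Each mixin consumes its own kwargs and forwards the remainder along the MRO,
# so both mixins receive their arguments no matter the inheritance order.
svc = Service(log_level="DEBUG", max_tasks=8)
print(svc.log_level, svc.max_tasks)  # DEBUG 8
```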
aiperf.common.mixins.communication_mixin¶
CommunicationMixin
¶
Bases: AIPerfLifecycleMixin, ABC
Mixin to provide access to a CommunicationProtocol instance. This mixin should be inherited by any mixin that needs access to the communication layer to create Communication clients.
Source code in aiperf/common/mixins/communication_mixin.py, lines 12-24
aiperf.common.mixins.hooks_mixin¶
HooksMixin
¶
Bases: AIPerfLoggerMixin
Mixin for a class to be able to provide hooks to its subclasses, and to be able to run them. A "hook" is a function that is decorated with a hook type (AIPerfHook), and optional parameters.
In order to provide hooks, a class MUST use the @provides_hooks decorator to declare the hook types it provides.
Only list hook types that you call get_hooks or run_hooks on, to get or run the functions that are decorated
with those hook types.
Provided hooks are recursively inherited by subclasses, so if a base class provides a hook,
all subclasses will also provide that hook (without having to explicitly declare it, or call get_hooks or run_hooks).
In fact, you typically should not get or run hooks from the base class, as this may lead to calling hooks twice.
Hooks are registered in the order they are defined within the same class from top to bottom, and each class's hooks are inspected starting with hooks defined in the lowest level of base classes, moving up to the highest subclass.
IMPORTANT:
- Hook decorated methods from one class can be named the same as methods in their base classes, and BOTH will be registered.
Meaning if class A and class B both have a method named _initialize, which is decorated with @on_init, and class B inherits from class A,
then both _initialize methods will be registered as hooks, and will be run in the order A._initialize, then B._initialize.
This is done without requiring the user to call super()._initialize in the subclass, as the base class hook will be run automatically.
However, the caveat is that it is not possible to disable the hook from the base class without extra work, and if the user does accidentally
call super()._initialize in the subclass, the base class hook may be called twice.
Source code in aiperf/common/mixins/hooks_mixin.py
(lines 16–164)
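The base-class-first registration and same-name behavior described above can be sketched with a minimal stand-in. This is not the actual AIPerf implementation; the decorator and helper names here are hypothetical, but the MRO-walking pattern is the standard way to achieve it:

```python
# Minimal sketch of MRO-ordered hook collection (hypothetical names,
# not the real AIPerf HooksMixin).
def on_init(func):
    """Mark a method as an init hook."""
    func._is_init_hook = True
    return func

def get_hooks(obj, reverse=False):
    """Collect hooks base-class-first by walking the MRO from the lowest base up."""
    hooks = []
    for cls in reversed(type(obj).__mro__):  # object -> bases -> subclass
        for _name, attr in vars(cls).items():
            if getattr(attr, "_is_init_hook", False):
                hooks.append(attr)
    return list(reversed(hooks)) if reverse else hooks

class A:
    @on_init
    def _initialize(self):
        return "A"

class B(A):
    @on_init
    def _initialize(self):  # same name as A's hook: BOTH are registered
        return "B"

# A's hook runs before B's, without B ever calling super()._initialize()
order = [hook(B()) for hook in get_hooks(B())]
print(order)  # ['A', 'B']
```

Because `vars(cls)` is inspected per class rather than via attribute lookup on the instance, the subclass's `_initialize` does not shadow the base class's hook, which is exactly the caveat described above.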
for_each_hook_param(*hook_types, self_obj, param_type, lambda_func, reverse=False)
¶
Iterate over the hooks for the given hook type(s), optionally reversed. If a lambda_func is provided, it will be called for each parameter of the hook, and the hook and parameter will be passed as arguments.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hook_types` | `HookType` | The hook types to iterate over. | `()` |
| `self_obj` | `Any` | The object to pass to the `lambda_func`. | required |
| `param_type` | `AnyT` | The type of the parameter to pass to the `lambda_func` (for validation). | required |
| `lambda_func` | `Callable[[Hook, AnyT], None]` | The function to call for each hook. | required |
| `reverse` | `bool` | Whether to iterate over the hooks in reverse order. | `False` |
Source code in aiperf/common/mixins/hooks_mixin.py
(lines 99–135)
get_hooks(*hook_types, reverse=False)
¶
Get the hooks that are defined by the class for the given hook type(s), optionally reversed. This will return a list of Hook objects that can be inspected for their type and parameters, and optionally called.
Source code in aiperf/common/mixins/hooks_mixin.py
(lines 85–97)
run_hooks(*hook_types, reverse=False, **kwargs)
async
¶
Run the hooks for the given hook type, waiting for each hook to complete before running the next one. Hooks are run in the order they are defined by the class, starting with hooks defined in the lowest level of base classes, moving up to the top level class. If more than one hook type is provided, the hooks from each level of classes will be run in the order of the hook types provided.
If reverse is True, the hooks will be run in reverse order. This is useful for stop/cleanup hooks, where you want to start with the children and end with the parent.
The kwargs are passed through as keyword arguments to each hook.
Source code in aiperf/common/mixins/hooks_mixin.py
(lines 137–164)
aiperf.common.mixins.message_bus_mixin¶
MessageBusClientMixin
¶
Bases: CommunicationMixin, ABC
Mixin to provide message bus clients (pub and sub) for AIPerf components, as well as a hook to handle messages: @on_message.
Source code in aiperf/common/mixins/message_bus_mixin.py
(lines 18–82)
publish(message)
async
¶
Publish a message. The message will be routed automatically based on the message type.
Source code in aiperf/common/mixins/message_bus_mixin.py
(lines 80–82)
subscribe(message_type, callback)
async
¶
Subscribe to a specific message type. The callback will be called when a message is received for the given message type.
Source code in aiperf/common/mixins/message_bus_mixin.py
(lines 59–66)
subscribe_all(message_callback_map)
async
¶
Subscribe to all message types in the map. The callback(s) will be called when a message is received for the given message type.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `message_callback_map` | `MessageCallbackMapT` | A map of message types to callbacks. The callbacks can be a single callback or a list of callbacks. | required |
Source code in aiperf/common/mixins/message_bus_mixin.py
(lines 68–78)
aiperf.common.mixins.process_health_mixin¶
ProcessHealthMixin
¶
Bases: BaseMixin
Mixin to provide process health information.
Source code in aiperf/common/mixins/process_health_mixin.py
(lines 12–49)
get_process_health()
¶
Get the process health information for the current process.
Source code in aiperf/common/mixins/process_health_mixin.py
(lines 25–49)
aiperf.common.mixins.pull_client_mixin¶
PullClientMixin
¶
Bases: CommunicationMixin, ABC
Mixin to provide a pull client for AIPerf components using a PullClient for the specified CommAddress. Add the @on_pull_message decorator to specify a function that will be called when a pull is received.
NOTE: This currently only supports a single pull client per service, as that is our current use case.
Source code in aiperf/common/mixins/pull_client_mixin.py
(lines 18–61)
aiperf.common.mixins.reply_client_mixin¶
ReplyClientMixin
¶
Bases: CommunicationMixin, ABC
Mixin to provide a reply client for AIPerf components using a ReplyClient for the specified CommAddress. Add the @on_request decorator to specify a function that will be called when a request is received.
NOTE: This currently only supports a single reply client per service, as that is our current use case.
Source code in aiperf/common/mixins/reply_client_mixin.py
(lines 18–59)
aiperf.common.mixins.task_manager_mixin¶
TaskManagerMixin
¶
Bases: AIPerfLoggerMixin
Mixin to manage a set of async tasks, and provide background task loop capabilities.
Can be used standalone, but it is most useful as part of the :class:AIPerfLifecycleMixin
mixin, where the lifecycle methods are automatically integrated with the task manager.
Source code in aiperf/common/mixins/task_manager_mixin.py
(lines 14–112)
cancel_all_tasks(timeout=TASK_CANCEL_TIMEOUT_SHORT)
async
¶
Cancel all tasks in the set and wait for up to timeout seconds for them to complete.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `timeout` | `float` | The timeout to wait for the tasks to complete. | `TASK_CANCEL_TIMEOUT_SHORT` |
Source code in aiperf/common/mixins/task_manager_mixin.py
(lines 38–51)
execute_async(coro)
¶
Create a task from a coroutine and add it to the set of tasks, and return immediately. The task will be automatically cleaned up when it completes.
Source code in aiperf/common/mixins/task_manager_mixin.py
(lines 25–32)
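The pattern behind execute_async (hold a strong reference to the task, discard it automatically on completion) is a common asyncio idiom. A minimal sketch, not the actual implementation; the class name is illustrative:

```python
import asyncio

class TaskManagerSketch:
    """Illustrative stand-in for the fire-and-forget task tracking pattern."""

    def __init__(self):
        self._tasks: set[asyncio.Task] = set()

    def execute_async(self, coro) -> asyncio.Task:
        # Keep a strong reference so the task is not garbage-collected
        # mid-flight, then discard it automatically when it completes.
        task = asyncio.create_task(coro)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return task

    async def wait_for_tasks(self):
        """Wait for all currently tracked tasks to complete."""
        if self._tasks:
            await asyncio.gather(*self._tasks)

async def main():
    tm = TaskManagerSketch()
    tm.execute_async(asyncio.sleep(0.01))
    tm.execute_async(asyncio.sleep(0.01))
    assert len(tm._tasks) == 2  # tracked while running
    await tm.wait_for_tasks()
    return len(tm._tasks)

remaining = asyncio.run(main())
print(remaining)  # 0 -- completed tasks were discarded automatically
```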
start_background_task(method, interval=None, immediate=False, stop_on_error=False, stop_event=None)
¶
Run a task in the background, in a loop until cancelled.
Source code in aiperf/common/mixins/task_manager_mixin.py
(lines 53–66)
wait_for_tasks()
async
¶
Wait for all current tasks to complete.
Source code in aiperf/common/mixins/task_manager_mixin.py
(lines 34–36)
aiperf.common.models.base_models¶
AIPerfBaseModel
¶
Bases: BaseModel
Base model for all AIPerf Pydantic models. This class is configured to allow arbitrary types to be used as fields as to allow for more flexible model definitions by end users without breaking the existing code.
The @exclude_if_none decorator can also be used to specify which fields should be excluded from the serialized model if they are None. This is a workaround for the fact that pydantic does not support specifying exclude_none on a per-field basis.
Source code in aiperf/common/models/base_models.py
(lines 25–53)
exclude_if_none(*field_names)
¶
Decorator to set the _exclude_if_none_fields class attribute to the set of field names that should be excluded if they are None.
Source code in aiperf/common/models/base_models.py
(lines 10–22)
aiperf.common.models.credit_models¶
CreditPhaseConfig
¶
Bases: AIPerfBaseModel
Model for phase credit config. This is used by the TimingManager to configure the credit phases.
Source code in aiperf/common/models/credit_models.py
(lines 14–47)
is_valid
property
¶
A phase config is valid if it is exactly one of the following:
- is_time_based (`expected_duration_sec` is set and > 0)
- is_request_count_based (`total_expected_requests` is set and > 0)
CreditPhaseStats
¶
Bases: CreditPhaseConfig
Model for phase credit stats. Extends the CreditPhaseConfig fields to track the progress of the credit phases: how many credits were dropped, how many were returned, and the progress percentage of the phase.
Source code in aiperf/common/models/credit_models.py
(lines 50–132)
in_flight
property
¶
Calculate the number of in-flight credits (sent but not completed).
should_send
property
¶
Whether the phase should send more credits.
from_phase_config(phase_config)
classmethod
¶
Create a CreditPhaseStats from a CreditPhaseConfig. This is used to initialize the stats for a phase.
Source code in aiperf/common/models/credit_models.py
(lines 125–132)
PhaseProcessingStats
¶
Bases: AIPerfBaseModel
Model for phase processing stats: how many requests were processed and how many errors were encountered.
Source code in aiperf/common/models/credit_models.py
(lines 135–153)
total_records
property
¶
The total number of records processed successfully or in error.
aiperf.common.models.dataset_models¶
Conversation
¶
Bases: AIPerfBaseModel
A dataset representation of a full conversation.
A conversation is a sequence of turns between a user and an endpoint, and it contains the session ID and all the turns that make up the conversation.
Source code in aiperf/common/models/dataset_models.py
(lines 63–73)
Turn
¶
Bases: AIPerfBaseModel
A dataset representation of a single turn within a conversation.
A turn is a single interaction between a user and an AI assistant, and it contains the timestamp, delay, and raw data that the user sends in each turn.
Source code in aiperf/common/models/dataset_models.py
(lines 36–60)
aiperf.common.models.error_models¶
ErrorDetails
¶
Bases: AIPerfBaseModel
Encapsulates details about an error.
Source code in aiperf/common/models/error_models.py
(lines 10–46)
__eq__(other)
¶
Check if the error details are equal by comparing the code, type, and message.
Source code in aiperf/common/models/error_models.py
(lines 26–34)
__hash__()
¶
Hash the error details by hashing the code, type, and message.
Source code in aiperf/common/models/error_models.py
(lines 36–38)
from_exception(e)
classmethod
¶
Create an error details object from an exception.
Source code in aiperf/common/models/error_models.py
(lines 40–46)
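Defining equality and hashing over (code, type, message) lets identical errors collapse into one bucket when counting, which is what ErrorDetailsCount relies on. A minimal sketch of the pattern (a plain class, not the actual pydantic model):

```python
from collections import Counter

class ErrorDetailsSketch:
    """Illustrative stand-in: equal and hashable by (code, type, message)."""

    def __init__(self, code=None, type=None, message=None):
        self.code, self.type, self.message = code, type, message

    @classmethod
    def from_exception(cls, e: Exception) -> "ErrorDetailsSketch":
        # Capture the exception class name and message
        return cls(type=e.__class__.__name__, message=str(e))

    def __eq__(self, other) -> bool:
        if not isinstance(other, ErrorDetailsSketch):
            return NotImplemented
        return (self.code, self.type, self.message) == \
               (other.code, other.type, other.message)

    def __hash__(self) -> int:
        return hash((self.code, self.type, self.message))

a = ErrorDetailsSketch.from_exception(ValueError("bad input"))
b = ErrorDetailsSketch.from_exception(ValueError("bad input"))
counts = Counter([a, b])
print(counts[a])  # 2 -- identical errors collapse into one bucket
```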
ErrorDetailsCount
¶
Bases: AIPerfBaseModel
Count of error details.
Source code in aiperf/common/models/error_models.py
(lines 49–56)
aiperf.common.models.health_models¶
ProcessHealth
¶
Bases: AIPerfBaseModel
Model for process health data.
Source code in aiperf/common/models/health_models.py
(lines 30–62)
aiperf.common.models.record_models¶
InferenceServerResponse
¶
Bases: AIPerfBaseModel
Response from an inference client.
Source code in aiperf/common/models/record_models.py
(lines 92–98)
MetricResult
¶
Bases: AIPerfBaseModel
The result values of a single metric.
Source code in aiperf/common/models/record_models.py
(lines 17–44)
ParsedResponseRecord
¶
Bases: AIPerfBaseModel
Record of a request and its associated responses, already parsed and ready for metrics.
Source code in aiperf/common/models/record_models.py
(lines 330–408)
end_perf_ns
cached
property
¶
Get the end time of the request in nanoseconds (perf_counter_ns). If request.end_perf_ns is not set, use the time of the last response. If there are no responses, use sys.maxsize.
has_error
cached
property
¶
Check if the response record has an error.
request_duration_ns
cached
property
¶
Get the duration of the request in nanoseconds.
start_perf_ns
cached
property
¶
Get the start time of the request in nanoseconds (perf_counter_ns).
timestamp_ns
cached
property
¶
Get the wall clock timestamp of the request in nanoseconds (time.time_ns). DO NOT USE FOR LATENCY CALCULATIONS.
tokens_per_second
cached
property
¶
Get the number of tokens per second of the request.
valid
cached
property
¶
Check if the response record is valid.
Checks:
- Request has no errors
- Has at least one response
- Start time is before the end time
- Response timestamps are within valid ranges
Returns:

| Name | Type | Description |
|---|---|---|
| `bool` | `bool` | True if the record is valid, False otherwise. |
ProcessRecordsResult
¶
Bases: AIPerfBaseModel
Result of the process records command.
Source code in aiperf/common/models/record_models.py
(lines 74–84)
RequestRecord
¶
Bases: AIPerfBaseModel
Record of a request with its associated responses.
Source code in aiperf/common/models/record_models.py
(lines 151–310)
delayed
property
¶
Check if the request was delayed.
has_error
property
¶
Check if the request record has an error.
inter_token_latency_ns
property
¶
Get the interval between responses in nanoseconds.
time_to_first_response_ns
property
¶
Get the time to the first response in nanoseconds.
time_to_last_response_ns
property
¶
Get the time to the last response in nanoseconds.
time_to_second_response_ns
property
¶
Get the time to the second response in nanoseconds.
valid
property
¶
Check if the request record is valid by ensuring that the start time and response timestamps are within valid ranges.
Returns:

| Name | Type | Description |
|---|---|---|
| `bool` | `bool` | True if the record is valid, False otherwise. |
token_latency_ns(index)
¶
Get the latency of a token in nanoseconds.
Source code in aiperf/common/models/record_models.py
(lines 296–310)
ResponseData
¶
Bases: AIPerfBaseModel
Base class for all response data.
Source code in aiperf/common/models/record_models.py
(lines 313–327)
SSEField
¶
Bases: AIPerfBaseModel
Base model for a single field in an SSE message.
Source code in aiperf/common/models/record_models.py
(lines 114–124)
SSEMessage
¶
Bases: InferenceServerResponse
Individual SSE message from an SSE stream. Delimited by a blank line (`\n\n`).
Source code in aiperf/common/models/record_models.py
(lines 127–148)
extract_data_content()
¶
Extract the data contents from the SSE message as a list of strings. Note that the SSE spec specifies that the data contents should be combined, delimited by a single newline (`\n`). We have left it as a list to allow the caller to decide how to handle the data.
Returns:
list[str]: A list of strings containing the data contents of the SSE message.
Source code in aiperf/common/models/record_models.py
(lines 136–148)
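A sketch of how the data fields might be pulled out of one raw SSE message. This is an illustrative stand-alone function, not the actual parser:

```python
def extract_data_content(raw_message: str) -> list[str]:
    """Collect the value of every `data:` field in a single SSE message.

    Per the SSE spec, multiple data fields belong to the same event and are
    conventionally joined with a single newline; here they stay as a list so
    the caller can decide how to combine them.
    """
    contents = []
    for line in raw_message.splitlines():
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):  # SSE spec: strip one leading space
                value = value[1:]
            contents.append(value)
    return contents

msg = 'event: message\ndata: {"token": "Hello"}\ndata: {"token": " world"}'
print(extract_data_content(msg))  # ['{"token": "Hello"}', '{"token": " world"}']
```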
TextResponse
¶
Bases: InferenceServerResponse
Raw text response from an inference client, including an optional content type.
Source code in aiperf/common/models/record_models.py
(lines 101–111)
aiperf.common.models.service_models¶
ServiceRunInfo
¶
Bases: BaseModel
Base model for tracking service run information.
Source code in aiperf/common/models/service_models.py
(lines 15–41)
aiperf.common.models.worker_models¶
WorkerPhaseTaskStats
¶
Bases: AIPerfBaseModel
Stats for the tasks that have been sent to the worker for a given credit phase.
Source code in aiperf/common/models/worker_models.py
(lines 10–33)
in_progress
property
¶
The number of tasks that are currently in progress.
This is the total number of tasks sent to the worker minus the number of failed and successfully completed tasks.
aiperf.common.protocols¶
AIPerfLifecycleProtocol
¶
Bases: TaskManagerProtocol, Protocol
Protocol for AIPerf lifecycle methods.
see :class:aiperf.common.mixins.aiperf_lifecycle_mixin.AIPerfLifecycleMixin for more details.
Source code in aiperf/common/protocols.py
(lines 88–113)
CommunicationProtocol
¶
Bases: AIPerfLifecycleProtocol, Protocol
Protocol for the base communication layer.
see :class:aiperf.common.comms.base_comms.BaseCommunication for more details.
Source code in aiperf/common/protocols.py
(lines 195–276)
create_client(client_type, address, bind=False, socket_ops=None, max_pull_concurrency=None)
¶
Create a client for the given client type and address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 205–215)
create_pub_client(address, bind=False, socket_ops=None)
¶
Create a PUB client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 217–225)
create_pull_client(address, bind=False, socket_ops=None, max_pull_concurrency=None)
¶
Create a PULL client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 247–256)
create_push_client(address, bind=False, socket_ops=None)
¶
Create a PUSH client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 237–245)
create_reply_client(address, bind=False, socket_ops=None)
¶
Create a REPLY client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 268–276)
create_request_client(address, bind=False, socket_ops=None)
¶
Create a REQUEST client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 258–266)
create_sub_client(address, bind=False, socket_ops=None)
¶
Create a SUB client for the given address, which will be automatically started and stopped with the CommunicationProtocol instance.
Source code in aiperf/common/protocols.py
(lines 227–235)
DataExporterProtocol
¶
Bases: Protocol
Protocol for data exporters.
Any class implementing this protocol must provide an export method
that takes a list of Record objects and handles exporting them appropriately.
Source code in aiperf/common/protocols.py
(lines 294–302)
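These protocols rely on `typing.Protocol` structural typing: any class with the right method signatures satisfies the protocol without inheriting from it. A minimal illustration (the exporter and record shapes here are hypothetical, not the real AIPerf types):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class DataExporterSketch(Protocol):
    """Structural protocol: anything with a matching export() conforms."""
    def export(self, records: list) -> None: ...

class ListCollector:  # note: no inheritance from DataExporterSketch
    def __init__(self):
        self.exported = []

    def export(self, records: list) -> None:
        self.exported.extend(records)

exporter: DataExporterSketch = ListCollector()  # type-checks structurally
exporter.export([{"metric": "ttft", "avg": 12.5}])

# runtime_checkable allows isinstance() checks based on method presence
print(isinstance(exporter, DataExporterSketch))  # True
```

This is why the docs say any class "implementing this protocol" works: conformance is by shape, not by subclassing.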
HooksProtocol
¶
Bases: Protocol
Protocol for hooks methods provided by the HooksMixin.
Source code in aiperf/common/protocols.py
(lines 305–320)
get_hooks(*hook_types, reversed=False)
¶
Get the hooks for the given hook type(s), optionally reversed.
Source code in aiperf/common/protocols.py
(lines 309–311)
run_hooks(*hook_types, reversed=False, **kwargs)
async
¶
Run the hooks for the given hook type, waiting for each hook to complete before running the next one. If reversed is True, the hooks will be run in reverse order. This is useful for stop/cleanup hooks, where you want to start with the children and end with the parent.
Source code in aiperf/common/protocols.py
(lines 313–320)
InferenceClientProtocol
¶
Bases: Protocol
Protocol for an inference server client.
This protocol defines the methods that must be implemented by any inference server client implementation that is compatible with the AIPerf framework.
Source code in aiperf/common/protocols.py
(lines 323–358)
__init__(model_endpoint)
¶
Create a new inference server client based on the provided configuration.
Source code in aiperf/common/protocols.py
(lines 331–333)
close()
async
¶
Close the client.
Source code in aiperf/common/protocols.py
(lines 356–358)
initialize()
async
¶
Initialize the inference server client in an asynchronous context.
Source code in aiperf/common/protocols.py
(lines 335–337)
send_request(model_endpoint, payload)
async
¶
Send a request to the inference server.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_endpoint` | `ModelEndpointInfoT` | The endpoint to send the request to. | required |
| `payload` | `RequestInputT` | The payload to send to the inference server. | required |
Returns: The raw response from the inference server.
Source code in aiperf/common/protocols.py
(lines 339–354)
MessageBusClientProtocol
¶
Bases: PubClientProtocol, SubClientProtocol, Protocol
A message bus client is a client that can publish and subscribe to messages on the event bus/message bus.
Source code in aiperf/common/protocols.py
(lines 279–286)
PostProcessorProtocol
¶
Bases: Protocol
PostProcessorProtocol is a protocol that defines the API for post-processors.
Source code in aiperf/common/protocols.py
(lines 361–375)
post_process()
¶
Execute the post-processing logic on the records.
Source code in aiperf/common/protocols.py
(lines 371–375)
process_record(record)
¶
Process a single record.
Source code in aiperf/common/protocols.py
(lines 367–369)
RequestConverterProtocol
¶
Bases: Protocol
Protocol for a request converter that converts a raw request to a formatted request for the inference server.
Source code in aiperf/common/protocols.py
(lines 390–398)
format_payload(model_endpoint, turn)
async
¶
Format the turn for the inference server.
Source code in aiperf/common/protocols.py
(lines 394–398)
ResponseExtractorProtocol
¶
Bases: Protocol
Protocol for a response extractor that extracts the response data from a raw inference server response and converts it to a list of ResponseData objects.
Source code in aiperf/common/protocols.py
(lines 378–387)
extract_response_data(record, tokenizer)
async
¶
Extract the response data from a raw inference server response and convert it to a list of ResponseData objects.
Source code in aiperf/common/protocols.py
(lines 383–387)
ServiceManagerProtocol
¶
Bases: AIPerfLifecycleProtocol, Protocol
Protocol for a service manager that manages the running of services using the specific ServiceRunType.
Abstracts away the details of service deployment and management.
see :class:aiperf.controller.base_service_manager.BaseServiceManager for more details.
Source code in aiperf/common/protocols.py
(lines 401–444)
ServiceProtocol
¶
Bases: MessageBusClientProtocol, Protocol
Protocol for a service. Essentially a MessageBusClientProtocol with service_type and service_id attributes.
Source code in aiperf/common/protocols.py
(lines 447–460)
StreamingPostProcessorProtocol
¶
Bases: AIPerfLifecycleProtocol, Protocol
Protocol for a streaming post processor that streams the incoming records to the post processor.
Source code in aiperf/common/protocols.py
(lines 463–482)
aiperf.common.tokenizer¶
Tokenizer
¶
This class provides a simplified interface for using Huggingface tokenizers, with default arguments for common operations.
Source code in aiperf/common/tokenizer.py
(lines 22–135)
bos_token_id
property
¶
Return the beginning-of-sequence (BOS) token ID.
__call__(text, **kwargs)
¶
Call the underlying Huggingface tokenizer with default arguments, which can be overridden by kwargs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | | The input text to tokenize. | required |

Returns:

| Type | Description |
|---|---|
| `BatchEncoding` | A BatchEncoding object containing the tokenized output. |
Source code in aiperf/common/tokenizer.py
(lines 61–74)
__init__()
¶
Initialize the tokenizer with default values for call, encode, and decode.
Source code in aiperf/common/tokenizer.py
(lines 28–35)
__repr__()
¶
Return a string representation of the underlying tokenizer.
Returns:

| Type | Description |
|---|---|
| `str` | The string representation of the tokenizer. |
Source code in aiperf/common/tokenizer.py
(lines 119–126)
__str__()
¶
Return a user-friendly string representation of the underlying tokenizer.
Returns:

| Type | Description |
|---|---|
| `str` | The string representation of the tokenizer. |
Source code in aiperf/common/tokenizer.py
(lines 128–135)
decode(token_ids, **kwargs)
¶
Decode a list of token IDs back into a string.
This method calls the underlying Huggingface tokenizer's decode method with default arguments, which can be overridden by kwargs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `token_ids` | | A list of token IDs to decode. | required |

Returns:

| Type | Description |
|---|---|
| `str` | The decoded string. |
Source code in aiperf/common/tokenizer.py
(lines 93–108)
encode(text, **kwargs)
¶
Encode the input text into a list of token IDs.
This method calls the underlying Huggingface tokenizer's encode method with default arguments, which can be overridden by kwargs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | | The input text to encode. | required |

Returns:

| Type | Description |
|---|---|
| `list[int]` | A list of token IDs. |
Source code in aiperf/common/tokenizer.py
(lines 76–91)
from_pretrained(name, trust_remote_code=False, revision='main')
classmethod
¶
Factory to load a tokenizer for the given pretrained model name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | The name or path of the pretrained tokenizer model. | required |
| `trust_remote_code` | `bool` | Whether to trust remote code when loading the tokenizer. | `False` |
| `revision` | `str` | The specific model version to use. | `'main'` |
Source code in aiperf/common/tokenizer.py
(lines 37–59)
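The "per-method defaults, overridable by kwargs" pattern used by __call__, encode, and decode can be sketched without the Huggingface dependency. The class, the fake tokenization rule, and the default values shown are all illustrative:

```python
class TokenizerSketch:
    """Illustrative wrapper: per-method default kwargs, overridable per call."""

    def __init__(self):
        # Defaults of the kind a benchmarking tokenizer wrapper might pick
        # (the actual AIPerf defaults may differ).
        self._encode_defaults = {"add_special_tokens": False}

    def encode(self, text: str, **kwargs) -> list[int]:
        # Merge defaults with caller kwargs; caller kwargs win.
        opts = {**self._encode_defaults, **kwargs}
        # Stand-in "tokenization": one id per whitespace-separated piece
        ids = list(range(len(text.split())))
        if opts.get("add_special_tokens"):
            ids = [-1, *ids]  # pretend BOS token
        return ids

tok = TokenizerSketch()
print(tok.encode("hello brave new world"))                          # [0, 1, 2, 3]
print(tok.encode("hello brave new world", add_special_tokens=True)) # [-1, 0, 1, 2, 3]
```

The dict-union merge is the whole trick: defaults are applied silently, and any keyword passed at the call site overrides them for that call only.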
aiperf.common.types¶
This module defines commonly used alias types for AIPerf. This both helps prevent circular imports and helps with type hinting.
aiperf.common.utils¶
call_all_functions(funcs, *args, **kwargs)
async
¶
Call all functions in the list with the given arguments.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `funcs` | | The functions to call. | required |
| `*args` | | The arguments to pass to the functions. | `()` |
| `**kwargs` | | The keyword arguments to pass to the functions. | `{}` |

Raises:

| Type | Description |
|---|---|
| `AIPerfMultiError` | If any of the functions raise an exception. |
Source code in aiperf/common/utils.py
(lines 50–76)
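The call-all-then-aggregate-errors pattern ensures every function runs even if an earlier one fails. A sketch under stated assumptions: AIPerfMultiError is the real exception type named above, but this MultiError stand-in and the function bodies are illustrative:

```python
import asyncio

class MultiError(Exception):
    """Stand-in for AIPerfMultiError: carries every collected exception."""
    def __init__(self, errors):
        super().__init__(f"{len(errors)} function(s) failed")
        self.errors = errors

async def call_all_functions(funcs, *args, **kwargs):
    # Call every function even if earlier ones fail, then raise once at the end
    errors = []
    for func in funcs:
        try:
            result = func(*args, **kwargs)
            if asyncio.iscoroutine(result):  # support sync and async callables
                await result
        except Exception as e:
            errors.append(e)
    if errors:
        raise MultiError(errors)

calls = []
async def ok(x): calls.append(x)
def bad(x): raise ValueError(x)

caught = None
try:
    asyncio.run(call_all_functions([ok, bad, ok], 1))
except MultiError as e:
    caught = e

print(len(caught.errors), calls)  # 1 [1, 1] -- both ok() calls still ran
```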
call_all_functions_self(self_, funcs, *args, **kwargs)
async
¶
Call all functions in the list with the given arguments, using the given object as self.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `self_` | | The object to call the functions on. | required |
| `funcs` | | The functions to call. | required |
| `*args` | | The arguments to pass to the functions. | `()` |
| `**kwargs` | | The keyword arguments to pass to the functions. | `{}` |

Raises:

| Type | Description |
|---|---|
| `AIPerfMultiError` | If any of the functions raise an exception. |
Source code in aiperf/common/utils.py
(lines 19–47)
load_json_str(json_str, func=lambda x: x)
¶
Deserializes a JSON-encoded string into a Python object.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `json_str` | `str` | The JSON-encoded string to deserialize. | required |
| `func` | `Callable` | A function that takes the deserialized JSON object; can be used to run validation checks on the object. Defaults to the identity function. | `lambda x: x` |
Source code in aiperf/common/utils.py, lines 79-98.
yield_to_event_loop()
async
¶
Yield to the event loop. This forces the current coroutine to yield and allow other coroutines to run, preventing starvation. Use this when you do not want to delay your coroutine via sleep, but still want to allow other coroutines to run if there is a potential for an infinite loop.
Source code in aiperf/common/utils.py, lines 101-107.
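In plain asyncio, yielding to the event loop amounts to `await asyncio.sleep(0)`, which suspends the coroutine for exactly one loop iteration; the worker and `main` below are illustrative only:

```python
import asyncio

async def yield_to_event_loop() -> None:
    # sleep(0) suspends the coroutine for one event-loop iteration,
    # letting other ready coroutines run without adding a real delay.
    await asyncio.sleep(0)

async def busy_worker(results):
    for i in range(3):
        results.append(("worker", i))
        await yield_to_event_loop()  # give other tasks a turn

async def main():
    results = []
    task = asyncio.create_task(busy_worker(results))
    await asyncio.sleep(0)          # let the worker start
    results.append(("main",))       # interleaved thanks to the yields
    await task
    return results

print(asyncio.run(main()))
```

Without the yield inside the loop, `busy_worker` would run all three iterations before `main` ever resumed.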
aiperf.controller.base_service_manager¶
BaseServiceManager
¶
Bases: AIPerfLifecycleMixin, ABC
Base class for service managers. It provides a common interface for managing services.
Source code in aiperf/controller/base_service_manager.py, lines 20-121.
stop_services_by_type(service_types)
async
¶
Stop a set of services.
Source code in aiperf/controller/base_service_manager.py, lines 74-88.
aiperf.controller.kubernetes_service_manager¶
KubernetesServiceManager
¶
Bases: BaseServiceManager
Service Manager for starting and stopping services in a Kubernetes cluster.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 29-97.
kill_all_services()
async
¶
Kill all required services as Kubernetes pods.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 63-69.
run_service(service_type, num_replicas=1)
async
¶
Run a service as a Kubernetes pod.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 45-53.
shutdown_all_services()
async
¶
Stop all required services as Kubernetes pods.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 55-61.
wait_for_all_services_registration(stop_event, timeout_seconds=DEFAULT_SERVICE_REGISTRATION_TIMEOUT)
async
¶
Wait for all required services to be registered in Kubernetes.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 71-83.
wait_for_all_services_start(stop_event, timeout_seconds=DEFAULT_SERVICE_START_TIMEOUT)
async
¶
Wait for all required services to be started in Kubernetes.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 85-97.
ServiceKubernetesRunInfo
¶
Bases: BaseModel
Information about a service running in a Kubernetes pod.
Source code in aiperf/controller/kubernetes_service_manager.py, lines 21-26.
aiperf.controller.multiprocess_service_manager¶
MultiProcessRunInfo
¶
Bases: BaseModel
Information about a service running as a multiprocessing process.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 27-40.
MultiProcessServiceManager
¶
Bases: BaseServiceManager
Service Manager for starting and stopping services as multiprocessing processes.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 43-224.
kill_all_services()
async
¶
Kill all required services as multiprocessing processes.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 124-137.
run_service(service_type, num_replicas=1)
async
¶
Run a service with the given number of replicas.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 62-96.
shutdown_all_services()
async
¶
Stop all required services as multiprocessing processes.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 114-122.
wait_for_all_services_registration(stop_event, timeout_seconds=DEFAULT_SERVICE_REGISTRATION_TIMEOUT)
async
¶
Wait for all required services to be registered.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| stop_event | Event | Event to check if the operation should be cancelled | required |
| timeout_seconds | float | Maximum time to wait in seconds | DEFAULT_SERVICE_REGISTRATION_TIMEOUT |
Source code in aiperf/controller/multiprocess_service_manager.py, lines 139-194.
wait_for_all_services_start(stop_event, timeout_seconds=DEFAULT_SERVICE_START_TIMEOUT)
async
¶
Wait for all required services to be started.
Source code in aiperf/controller/multiprocess_service_manager.py, lines 215-224.
aiperf.controller.proxy_manager¶
aiperf.controller.system_controller¶
SystemController
¶
Bases: SignalHandlerMixin, BaseService
System Controller service.
This service is responsible for managing the lifecycle of all other services. It will start, stop, and configure all other services.
Source code in aiperf/controller/system_controller.py, lines 48-378.
initialize()
async
¶
We need to override the initialize method to run the proxy manager before the base service initializes. This is because the proxies need to be running before we can subscribe to the message bus.
Source code in aiperf/controller/system_controller.py, lines 93-100.
main()
¶
Main entry point for the system controller.
Source code in aiperf/controller/system_controller.py, lines 381-386.
aiperf.controller.system_mixins¶
SignalHandlerMixin
¶
Bases: AIPerfLoggerMixin
Mixin for services that need to handle system signals.
Source code in aiperf/controller/system_mixins.py, lines 10-32.
setup_signal_handlers(callback)
¶
This method will set up signal handlers for the SIGTERM and SIGINT signals in order to trigger a graceful shutdown of the service.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| callback | Callable[[int], Coroutine] | The callback to call when a signal is received | required |
Source code in aiperf/controller/system_mixins.py, lines 18-32.
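A synchronous sketch of the same idea using the standard `signal` module; the mixin's real implementation is async, and `setup_signal_handlers` below is a local stand-in:

```python
import signal

def setup_signal_handlers(callback):
    # Route both SIGTERM and SIGINT to the same shutdown callback,
    # as the mixin does for graceful shutdown.
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, lambda signum, frame: callback(signum))

received = []
setup_signal_handlers(received.append)
signal.raise_signal(signal.SIGTERM)  # simulate an external shutdown request
print(received)
```

In an asyncio service, the same routing would typically go through `loop.add_signal_handler` so the callback can schedule a coroutine safely.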
aiperf.dataset.composer.base¶
BaseDatasetComposer
¶
Bases: ABC
Source code in aiperf/dataset/composer/base.py, lines 17-38.
create_dataset()
abstractmethod
¶
Create a set of conversation objects from the given configuration.
Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversation objects. |
Source code in aiperf/dataset/composer/base.py, lines 26-34.
aiperf.dataset.composer.custom¶
CustomDatasetComposer
¶
Bases: BaseDatasetComposer
Source code in aiperf/dataset/composer/custom.py, lines 15-45.
create_dataset()
¶
Create conversations from a file or directory.
Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversation objects. |
Source code in aiperf/dataset/composer/custom.py, lines 21-33.
aiperf.dataset.composer.synthetic¶
SyntheticDatasetComposer
¶
Bases: BaseDatasetComposer
Source code in aiperf/dataset/composer/synthetic.py, lines 15-156.
create_dataset()
¶
Create a synthetic conversation dataset from the given configuration.
It generates a set of conversations with a varying number of turns, where each turn contains synthetic text, image, and audio payloads.
Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversation objects. |
Source code in aiperf/dataset/composer/synthetic.py, lines 31-54.
aiperf.dataset.dataset_manager¶
DatasetManager
¶
Bases: ReplyClientMixin, BaseComponentService
The DatasetManager's primary responsibility is to manage data generation and acquisition. For synthetic generation, it contains the code to generate the prompts or tokens. It also provides an API for acquiring a dataset from a remote repository or database, if available.
Source code in aiperf/dataset/dataset_manager.py, lines 29-216.
main()
¶
Main entry point for the dataset manager.
Source code in aiperf/dataset/dataset_manager.py, lines 219-224.
aiperf.dataset.generator.audio¶
AudioGenerator
¶
Bases: BaseGenerator
A class for generating synthetic audio data.
This class provides methods to create audio samples with specified characteristics such as format (WAV, MP3), length, sampling rate, bit depth, and number of channels. It supports validation of audio parameters to ensure compatibility with chosen formats.
Source code in aiperf/dataset/generator/audio.py, lines 38-174.
generate(*args, **kwargs)
¶
Generate audio data with specified parameters.
Returns:

| Type | Description |
|---|---|
| str | Data URI containing base64-encoded audio data with format specification |

Raises:

| Type | Description |
|---|---|
| ConfigurationError | If any of the following conditions are met: the audio length is less than 0.01 seconds; channels is not 1 (mono) or 2 (stereo); the sampling rate is not supported for MP3 format; the bit depth is not 8, 16, 24, or 32; or the audio format is not 'wav' or 'mp3'. |
Source code in aiperf/dataset/generator/audio.py, lines 92-174.
aiperf.dataset.generator.base¶
BaseGenerator
¶
Bases: ABC
Abstract base class for all data generators.
Provides a consistent interface for generating synthetic data while allowing each generator type to use its own specific configuration and runtime parameters.
Source code in aiperf/dataset/generator/base.py, lines 8-29.
generate(*args, **kwargs)
abstractmethod
¶
Generate synthetic data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| *args | | Variable length argument list (subclass-specific) | () |
| **kwargs | | Arbitrary keyword arguments (subclass-specific) | {} |

Returns:

| Type | Description |
|---|---|
| str | Generated data as a string (could be text, base64 encoded media, etc.) |
Source code in aiperf/dataset/generator/base.py, lines 18-29.
aiperf.dataset.generator.image¶
ImageGenerator
¶
Bases: BaseGenerator
A class that generates images from source images.
This class provides methods to create synthetic images by resizing source images (located in the 'assets/source_images' directory) to specified dimensions and converting them to a chosen image format (e.g., PNG, JPEG). The dimensions can be randomized based on mean and standard deviation values.
Source code in aiperf/dataset/generator/image.py, lines 16-69.
generate(*args, **kwargs)
¶
Generate an image with the configured parameters.
Returns:

| Type | Description |
|---|---|
| str | A base64 encoded string of the generated image. |
Source code in aiperf/dataset/generator/image.py, lines 29-57.
aiperf.dataset.generator.prompt¶
PromptGenerator
¶
Bases: BaseGenerator
A class for generating synthetic prompts from a text corpus.
This class loads a text corpus (e.g., Shakespearean text), tokenizes it, and uses the tokenized corpus to generate synthetic prompts of specified lengths. It supports generating prompts with a target number of tokens (with optional randomization around a mean and standard deviation) and can reuse previously generated token blocks to optimize generation for certain use cases. It also allows for the creation of a pool of prefix prompts that can be randomly selected.
Source code in aiperf/dataset/generator/prompt.py, lines 25-234.
generate(mean=None, stddev=None, hash_ids=None)
¶
Generate a synthetic prompt with the configuration parameters.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| mean | int \| None | The mean of the normal distribution. | None |
| stddev | int \| None | The standard deviation of the normal distribution. | None |
| hash_ids | list[int] \| None | A list of hash indices used for token reuse. | None |

Returns:

| Type | Description |
|---|---|
| str | A synthetic prompt as a string. |
Source code in aiperf/dataset/generator/prompt.py, lines 100-122.
get_random_prefix_prompt()
¶
Fetch a random prefix prompt from the pool.
Returns:

| Type | Description |
|---|---|
| str | A random prefix prompt. |

Raises:

| Type | Description |
|---|---|
| InvalidStateError | If the prefix prompts pool is empty. |
Source code in aiperf/dataset/generator/prompt.py, lines 219-234.
aiperf.dataset.loader.models¶
CustomData = Annotated[SingleTurn | MooncakeTrace | MultiTurn, Field(discriminator='type')]
module-attribute
¶
A union type of all custom data types.
MooncakeTrace
¶
Bases: AIPerfBaseModel
Defines the schema for Mooncake trace data.
See https://github.com/kvcache-ai/Mooncake for more details.
Example:
{"timestamp": 1000, "input_length": 10, "output_length": 4, "hash_ids": [123, 456]}
Source code in aiperf/dataset/loader/models.py, lines 148-166.
MultiTurn
¶
Bases: AIPerfBaseModel
Defines the schema for multi-turn conversations.
The multi-turn custom dataset:

- supports multi-modal data (e.g. text, image, audio)
- supports multi-turn features (e.g. delay, sessions, etc.)
- supports client-side batching for each data (e.g. batch size > 1)
Source code in aiperf/dataset/loader/models.py, lines 73-96.
validate_turns_not_empty()
¶
Ensure at least one turn is provided
Source code in aiperf/dataset/loader/models.py, lines 91-96.
RandomPool
¶
Bases: AIPerfBaseModel
Defines the schema for random pool data entry.
The random pool custom dataset:

- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch size > 1)
- supports named fields for each modality (e.g. text_field_a, text_field_b, etc.)
- DOES NOT support multi-turn or its features (e.g. delay, sessions, etc.)
Source code in aiperf/dataset/loader/models.py, lines 99-145.
validate_at_least_one_modality()
¶
Ensure at least one modality is provided
Source code in aiperf/dataset/loader/models.py, lines 138-145.
validate_mutually_exclusive_fields()
¶
Ensure mutually exclusive fields are not set together
Source code in aiperf/dataset/loader/models.py, lines 127-136.
SingleTurn
¶
Bases: AIPerfBaseModel
Defines the schema for single-turn data.
User can use this format to quickly provide a custom single turn dataset. Each line in the file will be treated as a single turn conversation.
The single turn type:

- supports multi-modal (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch_size > 1)
- DOES NOT support multi-turn features (e.g. session_id)
Source code in aiperf/dataset/loader/models.py, lines 12-70.
validate_at_least_one_modality()
¶
Ensure at least one modality is provided
Source code in aiperf/dataset/loader/models.py, lines 63-70.
validate_mutually_exclusive_fields()
¶
Ensure mutually exclusive fields are not set together
Source code in aiperf/dataset/loader/models.py, lines 50-61.
aiperf.dataset.loader.mooncake_trace¶
MooncakeTraceDatasetLoader
¶
A dataset loader that loads Mooncake trace data from a file.
Loads Mooncake trace data from a file and converts the data into a list of conversations for dataset manager.
Each line in the file represents a single trace entry and will be converted to a separate conversation with a unique session ID.
Example: Fixed schedule version (Each line is a distinct session. Multi-turn is NOT supported)
{"timestamp": 1000, "input_length": 300, "output_length": 40, "hash_ids": [123, 456]}
Source code in aiperf/dataset/loader/mooncake_trace.py, lines 14-80.
convert_to_conversations(data)
¶
Convert all the Mooncake trace data to conversation objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, list[MooncakeTrace]] | A dictionary of session_id and list of Mooncake trace data. | required |

Returns:

| Type | Description |
|---|---|
| list[Conversation] | A list of conversations. |
Source code in aiperf/dataset/loader/mooncake_trace.py, lines 54-80.
load_dataset()
¶
Load Mooncake trace data from a file.
Returns:

| Type | Description |
|---|---|
| dict[str, list[MooncakeTrace]] | A dictionary of session_id and list of Mooncake trace data. |
Source code in aiperf/dataset/loader/mooncake_trace.py, lines 35-52.
aiperf.dataset.loader.multi_turn¶
MultiTurnDatasetLoader
¶
A dataset loader that loads multi-turn data from a file.
The multi-turn type:

- supports multi-modal data (e.g. text, image, audio)
- supports multi-turn features (e.g. delay, sessions, etc.)
- supports client-side batching for each data (e.g. batch_size > 1)
NOTE: If the user specifies multiple multi-turn entries with same session ID, the loader will group them together. If the timestamps are specified, they will be sorted in ascending order later in the timing manager.
Examples: 1. Simple version
{
"session_id": "session_123",
"turns": [
{"text": "Hello", "image": "url", "delay": 0},
{"text": "Hi there", "delay": 1000}
]
}
2. Batched version
{
"session_id": "session_123",
"turns": [
{"texts": ["Who are you?", "Hello world"], "images": ["/path/1.png", "/path/2.png"]},
{"texts": ["What is in the image?", "What is AI?"], "images": ["/path/3.png", "/path/4.png"]}
]
}
3. Fixed schedule version
{
"session_id": "session_123",
"turns": [
{"timestamp": 0, "text": "What is deep learning?"},
{"timestamp": 1000, "text": "Who are you?"}
]
}
4. Time delayed version
{
"session_id": "session_123",
"turns": [
{"delay": 0, "text": "What is deep learning?"},
{"delay": 1000, "text": "Who are you?"}
]
}
5. Full-featured version (multi-batch, multi-modal, multi-fielded, session-based, etc.)
{
"session_id": "session_123",
"turns": [
{
"timestamp": 1234,
"texts": [
{"name": "text_field_a", "contents": ["hello", "world"]},
{"name": "text_field_b", "contents": ["hi there"]}
],
"images": [
{"name": "image_field_a", "contents": ["/path/1.png", "/path/2.png"]},
{"name": "image_field_b", "contents": ["/path/3.png"]}
]
}
]
}
Source code in aiperf/dataset/loader/multi_turn.py, lines 12-114.
load_dataset()
¶
Load multi-turn data from a JSONL file.
Each line represents a complete multi-turn conversation with its own session_id and multiple turns.
Returns:

| Type | Description |
|---|---|
| dict[str, list[MultiTurn]] | A dictionary mapping session_id to list of CustomData (containing the MultiTurn). |
Source code in aiperf/dataset/loader/multi_turn.py, lines 94-114.
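The grouping behaviour described above can be sketched with `json` and a `defaultdict`; this is an illustrative stand-in, not the loader's actual code, and the sample lines are invented:

```python
import json
from collections import defaultdict

# Inline stand-in for a JSONL file; two entries share a session_id.
jsonl = """\
{"session_id": "session_123", "turns": [{"text": "Hello"}]}
{"session_id": "session_123", "turns": [{"text": "Follow-up"}]}
{"session_id": "session_456", "turns": [{"text": "Hi"}]}
"""

def group_by_session(lines):
    # Parse each non-empty line and group entries sharing a session_id,
    # mirroring the grouping the loader performs.
    sessions = defaultdict(list)
    for line in lines.splitlines():
        if line.strip():
            entry = json.loads(line)
            sessions[entry["session_id"]].append(entry)
    return dict(sessions)

data = group_by_session(jsonl)
print(sorted(data))  # ['session_123', 'session_456']
```

Any timestamp ordering within a session is left to later stages (per the NOTE above, the timing manager sorts timestamps).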
aiperf.dataset.loader.protocol¶
aiperf.dataset.loader.random_pool¶
RandomPoolDatasetLoader
¶
A dataset loader that loads data from a single file or a directory.
Each line in the file represents single-turn conversation data, and files create individual pools for random sampling:

- Single file: All lines form one single pool (to be randomly sampled from)
- Directory: Each file becomes a separate pool, then pools are randomly sampled and merged into conversations later.

The random pool custom dataset:

- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch size > 1)
- supports named fields for each modality (e.g. text_field_a, text_field_b, etc.)
- DOES NOT support multi-turn or its features (e.g. delay, sessions, etc.)
Example:
- Single file
{"text": "Who are you?", "image": "/path/to/image1.png"}
{"text": "Explain what is the meaning of life.", "image": "/path/to/image2.png"}
...
The file will form a single pool of text and image data that will be used to generate conversations.
- Directory
A directory is useful if the user wants to:

- create multiple pools of different modalities separately (e.g. text, image)
- specify different field names for the same modality.
data/queries.jsonl
{"texts": [{"name": "query", "contents": ["Who are you?"]}]}
{"texts": [{"name": "query", "contents": ["What is the meaning of life?"]}]}
...
data/passages.jsonl
{"texts": [{"name": "passage", "contents": ["I am a cat."]}]}
{"texts": [{"name": "passage", "contents": ["I am a dog."]}]}
...
The loader will create two separate pools, one per file: queries and passages. Each pool is a text dataset with a different field name (e.g. query, passage), and the loader will later sample from these two pools to create conversations.
Source code in aiperf/dataset/loader/random_pool.py, lines 16-128.
load_dataset()
¶
Load random pool data from a file or directory.
If filename is a file, reads and parses using the RandomPool model. If filename is a directory, reads each file in the directory and merges items with different modality names into combined RandomPool objects.
Returns:

| Type | Description |
|---|---|
| dict[Filename, list[RandomPool]] | A dictionary mapping filename to list of RandomPool objects. |
Source code in aiperf/dataset/loader/random_pool.py, lines 71-87.
aiperf.dataset.loader.single_turn¶
SingleTurnDatasetLoader
¶
A dataset loader that loads single turn data from a file.
The single turn type:

- supports multi-modal data (e.g. text, image, audio)
- supports client-side batching for each data (e.g. batch_size > 1)
- DOES NOT support multi-turn features (e.g. delay, sessions, etc.)
Examples: 1. Single-batch, text only
{"text": "What is deep learning?"}
2. Single-batch, multi-modal
{"text": "What is in the image?", "image": "/path/to/image.png"}
3. Multi-batch, multi-modal
{"texts": ["Who are you?", "Hello world"], "images": ["/path/to/image.png", "/path/to/image2.png"]}
4. Fixed schedule version
{"timestamp": 0, "text": "What is deep learning?"},
{"timestamp": 1000, "text": "Who are you?"},
{"timestamp": 2000, "text": "What is AI?"}
5. Time delayed version
{"delay": 0, "text": "What is deep learning?"},
{"delay": 1234, "text": "Who are you?"}
6. Full-featured version (Multi-batch, multi-modal, multi-fielded)
{
"texts": [
{"name": "text_field_A", "contents": ["Hello", "World"]},
{"name": "text_field_B", "contents": ["Hi there"]}
],
"images": [
{"name": "image_field_A", "contents": ["/path/1.png", "/path/2.png"]},
{"name": "image_field_B", "contents": ["/path/3.png"]}
]
}
Source code in aiperf/dataset/loader/single_turn.py, lines 12-88.
load_dataset()
¶
Load single-turn data from a JSONL file.
Each line represents a single turn conversation. Multiple turns with the same session_id (or generated UUID) are grouped together.
Returns:

| Type | Description |
|---|---|
| dict[str, list[SingleTurn]] | A dictionary mapping session_id to list of CustomData. |
Source code in aiperf/dataset/loader/single_turn.py, lines 68-88.
aiperf.dataset.utils¶
check_file_exists(filename)
¶
Verifies that the file exists.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filename | | The file path to verify. | required |

Raises:

| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
Source code in aiperf/dataset/utils.py, lines 18-28.
encode_image(img, format)
¶
Encodes an image into base64 encoded string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| img | Image | The PIL Image object to encode. | required |
| format | str | The image format to use (e.g., "JPEG", "PNG"). | required |

Returns:

| Type | Description |
|---|---|
| str | A base64 encoded string representation of the image. |
Source code in aiperf/dataset/utils.py, lines 57-74.
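The encode step boils down to base64 plus a data-URI wrapper. The sketch below skips the Pillow dependency and works on raw bytes; `encode_bytes_as_data_uri` is a hypothetical helper, not the library's function (which takes a PIL `Image` and saves it to an in-memory buffer first):

```python
import base64

def encode_bytes_as_data_uri(data: bytes, mime: str) -> str:
    # Base64-encode the raw bytes and wrap them in a data URI,
    # the same shape the generators embed in request payloads.
    b64 = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{b64}"

uri = encode_bytes_as_data_uri(b"\x89PNG\r\n", "image/png")
print(uri)  # data:image/png;base64,iVBORw0K
```

With Pillow, the same pattern would save the image into an `io.BytesIO` buffer in the requested format before encoding `buffer.getvalue()`.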
load_json_str(json_str, func=lambda x: x)
¶
Deserializes JSON encoded string into Python object.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| json_str | str | JSON encoded string | required |
| func | Callable | A function that takes the deserialized JSON object. This can be used to run validation checks on the object. Defaults to the identity function. | lambda x: x |

Returns:

| Type | Description |
|---|---|
| dict[str, Any] | The deserialized JSON object. |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If the JSON string is invalid. |
Source code in aiperf/dataset/utils.py, lines 77-96.
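A minimal sketch matching the documented contract (parse, apply the optional validation hook, wrap parse failures in RuntimeError); this is an illustration, not the library's code:

```python
import json

def load_json_str(json_str, func=lambda x: x):
    # Deserialize, then hand the object to the optional validation /
    # transform hook; report invalid JSON as a RuntimeError.
    try:
        return func(json.loads(json_str))
    except json.JSONDecodeError as e:
        raise RuntimeError(f"Invalid JSON string: {json_str!r}") from e

print(load_json_str('{"timestamp": 1000}', func=lambda d: d["timestamp"]))  # 1000
```

The `func` hook is handy for validating a parsed line against a schema (e.g. a pydantic model) in one call.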
open_image(filename)
¶
Opens an image file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filename | | The file path to open. | required |

Returns:

| Type | Description |
|---|---|
| Image | The opened PIL Image object. |

Raises:

| Type | Description |
|---|---|
| FileNotFoundError | If the file does not exist. |
Source code in aiperf/dataset/utils.py, lines 31-54.
sample_normal(mean, stddev, lower=-np.inf, upper=np.inf)
¶
Sample from a normal distribution with support for bounds using rejection sampling.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| mean | float | The mean of the normal distribution. | required |
| stddev | float | The standard deviation of the normal distribution. | required |
| lower | float | The lower bound of the distribution. | -inf |
| upper | float | The upper bound of the distribution. | inf |

Returns:

| Type | Description |
|---|---|
| int | An integer sampled from the distribution. |
Source code in aiperf/dataset/utils.py
99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 | |
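Rejection sampling with bounds can be sketched like this; the use of `random.gauss` and rounding to an integer are assumptions inferred from the signature and the documented `int` return type:

```python
import random


def sample_normal_sketch(
    mean: float,
    stddev: float,
    lower: float = float("-inf"),
    upper: float = float("inf"),
) -> int:
    """Draw from a normal distribution, rejecting any sample that falls
    outside [lower, upper], and round the accepted sample to an integer."""
    while True:
        value = random.gauss(mean, stddev)
        if lower <= value <= upper:
            return round(value)
```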
sample_positive_normal(mean, stddev)
¶
Sample from a normal distribution ensuring positive values without distorting the distribution.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `mean` | `float` | Mean value for the normal distribution. | *required* |
| `stddev` | `float` | Standard deviation for the normal distribution. | *required* |

Returns:

| Type | Description |
|---|---|
| `float` | A positive sample from the normal distribution. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If mean is less than 0. |

Source code in aiperf/dataset/utils.py
sample_positive_normal_integer(mean, stddev)
¶
Sample a random positive integer from a normal distribution.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `mean` | `float` | The mean of the normal distribution. | *required* |
| `stddev` | `float` | The standard deviation of the normal distribution. | *required* |

Returns:

| Type | Description |
|---|---|
| `int` | A positive integer sampled from the distribution. If the sampled number is less than 1, it returns 1. |

Source code in aiperf/dataset/utils.py
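Per the docstring, samples below 1 are clamped to 1. A sketch under those assumptions (rounding strategy is illustrative):

```python
import random


def sample_positive_normal_integer_sketch(mean: float, stddev: float) -> int:
    """Sample from a normal distribution, round to the nearest integer,
    and clamp the result to a minimum of 1."""
    return max(1, round(random.gauss(mean, stddev)))
```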
aiperf.exporters.console_error_exporter¶
ConsoleErrorExporter
¶
A class that exports error data to the console.
Source code in aiperf/exporters/console_error_exporter.py
aiperf.exporters.console_exporter¶
ConsoleExporter
¶
A class that exports data to the console.
Source code in aiperf/exporters/console_exporter.py
aiperf.exporters.exporter_config¶
aiperf.exporters.exporter_manager¶
ExporterManager
¶
Bases: AIPerfLoggerMixin
ExporterManager is responsible for exporting records using all registered data exporters.
Source code in aiperf/exporters/exporter_manager.py
aiperf.exporters.json_exporter¶
JsonExportData
¶
Bases: BaseModel
Data to be exported to a JSON file.
Source code in aiperf/exporters/json_exporter.py
JsonExporter
¶
Bases: AIPerfLoggerMixin
A class to export records to a JSON file.
Source code in aiperf/exporters/json_exporter.py
aiperf.module_loader¶
Module loader for AIPerf.
This module is used to load all modules into the system to ensure everything is registered and ready to be used. This is done to avoid the performance penalty of importing all modules during CLI startup, while still ensuring that all implementations are properly registered with their factories.
ensure_modules_loaded()
¶
Ensure all modules are loaded exactly once.
Source code in aiperf/module_loader.py
aiperf.services.base_component_service¶
BaseComponentService
¶
Bases: BaseService
Base class for all Component services.
This class provides a common interface for all Component services in the AIPerf framework such as the Timing Manager, Dataset Manager, etc.
It extends the BaseService by adding heartbeat and registration functionality, as well as publishing the current state of the service to the system controller.
Source code in aiperf/services/base_component_service.py
aiperf.services.base_service¶
BaseService
¶
Bases: MessageBusClientMixin, ABC
Base class for all AIPerf services, providing common functionality for communication, state management, and lifecycle operations. This class inherits from the MessageBusClientMixin, which provides the message bus client functionality.
This class provides the foundation for implementing the various services of the AIPerf system. Some of the abstract methods are implemented here, while others are still required to be implemented by derived classes.
Source code in aiperf/services/base_service.py
service_type
class-attribute
¶
The type of service this class implements. This is set by the ServiceFactory.register decorator.
stop()
async
¶
This overrides the base class stop method to handle the case where the service is already stopping. In this case, we need to kill the process to be safe.
Source code in aiperf/services/base_service.py
aiperf.services.inference_result_parser.inference_result_parser¶
InferenceResultParser
¶
Bases: PullClientMixin, BaseComponentService
InferenceResultParser is responsible for parsing the inference results and pushing them to the RecordsManager.
Source code in aiperf/services/inference_result_parser/inference_result_parser.py
compute_input_token_count(record, tokenizer)
async
¶
Compute the number of tokens in the input for a given request record.
Source code in aiperf/services/inference_result_parser/inference_result_parser.py
get_tokenizer(model)
async
¶
Get the tokenizer for a given model.
Source code in aiperf/services/inference_result_parser/inference_result_parser.py
process_valid_record(message)
async
¶
Process a valid request record.
Source code in aiperf/services/inference_result_parser/inference_result_parser.py
aiperf.services.inference_result_parser.openai_parsers¶
OpenAIObject
¶
Bases: CaseInsensitiveStrEnum
Types of OpenAI objects.
Source code in aiperf/services/inference_result_parser/openai_parsers.py
parse(text)
classmethod
¶
Attempt to parse a string into an OpenAI object.
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the object is invalid. |

Source code in aiperf/services/inference_result_parser/openai_parsers.py
parse_list(obj)
classmethod
¶
Attempt to parse a string into an OpenAI object from a list.
Raises:

| Type | Description |
|---|---|
| `ValueError` | If the object is invalid. |

Source code in aiperf/services/inference_result_parser/openai_parsers.py
OpenAIResponseExtractor
¶
Extractor for OpenAI responses.
Source code in aiperf/services/inference_result_parser/openai_parsers.py
__init__(model_endpoint)
¶
Create a new response extractor based on the provided configuration.
Source code in aiperf/services/inference_result_parser/openai_parsers.py
extract_response_data(record, tokenizer)
async
¶
Extract the text from a server response message.
Source code in aiperf/services/inference_result_parser/openai_parsers.py
aiperf.services.records_manager.metrics.base_metric¶
BaseMetric
¶
Bases: ABC
Base class for all metrics with automatic subclass registration.
Source code in aiperf/services/records_manager/metrics/base_metric.py
__init_subclass__(**kwargs)
¶
This method is called when a class is subclassed from Metric. It automatically registers the subclass in the `metric_interfaces` dictionary using the `tag` class attribute.
The `tag` attribute must be a non-empty string that uniquely identifies the metric type. Only concrete (non-abstract) classes will be registered.
Source code in aiperf/services/records_manager/metrics/base_metric.py
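The registration mechanism can be sketched as follows. This simplified version registers any subclass that defines a non-empty string `tag`; the real class additionally skips abstract subclasses.

```python
class BaseMetricSketch:
    """Illustrative re-creation of automatic subclass registration."""

    tag: str = ""
    metric_interfaces: dict = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register any subclass that declares a non-empty string tag.
        if isinstance(cls.tag, str) and cls.tag:
            BaseMetricSketch.metric_interfaces[cls.tag] = cls


class TTFTMetricExample(BaseMetricSketch):
    tag = "time_to_first_token"
```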
get_all()
classmethod
¶
Returns the dictionary of all registered metric interfaces.
This method dynamically imports all metric type modules from the 'types' directory to ensure all metric classes are registered via `__init_subclass__`.
Returns:

| Type | Description |
|---|---|
| `dict[str, type[BaseMetric]]` | Mapping of metric tags to their corresponding classes. |

Raises:

| Type | Description |
|---|---|
| `MetricTypeError` | If there's an error importing metric type modules. |

Source code in aiperf/services/records_manager/metrics/base_metric.py
update_value(record=None, metrics=None)
abstractmethod
¶
Updates the metric value based on the provided record and dictionary of other metrics.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `record` | `Optional[Record]` | The record to update the metric with. | `None` |
| `metrics` | `Optional[dict[BaseMetric]]` | A dictionary of other metrics that may be needed for calculation. | `None` |

Source code in aiperf/services/records_manager/metrics/base_metric.py
values()
abstractmethod
¶
Returns the list of calculated metrics.
Source code in aiperf/services/records_manager/metrics/base_metric.py
aiperf.services.records_manager.metrics.types.benchmark_duration_metric¶
BenchmarkDurationMetric
¶
Bases: BaseMetric
Post-processor for calculating the Benchmark Duration metric.
Source code in aiperf/services/records_manager/metrics/types/benchmark_duration_metric.py
values()
¶
Returns the BenchmarkDuration metric.
Source code in aiperf/services/records_manager/metrics/types/benchmark_duration_metric.py
aiperf.services.records_manager.metrics.types.input_sequence_length_metric¶
InputSequenceLengthMetric
¶
Bases: BaseMetric
Post-processor for calculating Input Sequence Length (ISL) metrics from records.
Source code in aiperf/services/records_manager/metrics/types/input_sequence_length_metric.py
values()
¶
Returns the list of Input Sequence Length (ISL) metrics.
Source code in aiperf/services/records_manager/metrics/types/input_sequence_length_metric.py
aiperf.services.records_manager.metrics.types.inter_token_latency_metric¶
InterTokenLatencyMetric
¶
Bases: BaseMetric
Post-processor for calculating the Inter Token Latency (ITL) metric.
Source code in aiperf/services/records_manager/metrics/types/inter_token_latency_metric.py
values()
¶
Returns the list of Inter Token Latency (ITL) metrics.
Source code in aiperf/services/records_manager/metrics/types/inter_token_latency_metric.py
aiperf.services.records_manager.metrics.types.max_response_metric¶
MaxResponseMetric
¶
Bases: BaseMetric
Post-processor for calculating the maximum response time stamp metric from records.
Source code in aiperf/services/records_manager/metrics/types/max_response_metric.py
update_value(record=None, metrics=None)
¶
Adds a new record and calculates the maximum response timestamp metric.
Source code in aiperf/services/records_manager/metrics/types/max_response_metric.py
values()
¶
Returns the Max Response Timestamp metric.
Source code in aiperf/services/records_manager/metrics/types/max_response_metric.py
aiperf.services.records_manager.metrics.types.min_request_metric¶
MinRequestMetric
¶
Bases: BaseMetric
Post-processor for calculating the minimum request time stamp metric from records.
Source code in aiperf/services/records_manager/metrics/types/min_request_metric.py
update_value(record=None, metrics=None)
¶
Adds a new record and calculates the minimum request timestamp metric.
Source code in aiperf/services/records_manager/metrics/types/min_request_metric.py
values()
¶
Returns the Minimum Request Timestamp metric.
Source code in aiperf/services/records_manager/metrics/types/min_request_metric.py
aiperf.services.records_manager.metrics.types.output_sequence_length_metric¶
OutputSequenceLengthMetric
¶
Bases: BaseMetric
Post-processor for calculating Output Sequence Length (OSL) metrics from records.
Source code in aiperf/services/records_manager/metrics/types/output_sequence_length_metric.py
values()
¶
Returns the list of Output Sequence Length (OSL) metrics.
Source code in aiperf/services/records_manager/metrics/types/output_sequence_length_metric.py
aiperf.services.records_manager.metrics.types.output_token_count_metric¶
OutputTokenCountMetric
¶
Bases: BaseMetric
Post-processor for calculating the Output Token Count metric.
Source code in aiperf/services/records_manager/metrics/types/output_token_count_metric.py
aiperf.services.records_manager.metrics.types.output_token_throughput_metric¶
OutputTokenThroughputMetric
¶
Bases: BaseMetric
Post-processor for calculating the Output Token Throughput metric.
Source code in aiperf/services/records_manager/metrics/types/output_token_throughput_metric.py
values()
¶
Returns the OutputTokenThroughput metric.
Source code in aiperf/services/records_manager/metrics/types/output_token_throughput_metric.py
aiperf.services.records_manager.metrics.types.output_token_throughput_per_user_metric¶
OutputTokenThroughputPerUserMetric
¶
Bases: BaseMetric
Post-processor for calculating Output Token Throughput per User metrics from records.
Source code in aiperf/services/records_manager/metrics/types/output_token_throughput_per_user_metric.py
values()
¶
Returns the list of Output Token Throughput Per User metrics.
Source code in aiperf/services/records_manager/metrics/types/output_token_throughput_per_user_metric.py
aiperf.services.records_manager.metrics.types.request_count_metric¶
RequestCountMetric
¶
Bases: BaseMetric
Post-processor for counting the number of valid requests.
Source code in aiperf/services/records_manager/metrics/types/request_count_metric.py
values()
¶
Returns the Request Count metric.
Source code in aiperf/services/records_manager/metrics/types/request_count_metric.py
aiperf.services.records_manager.metrics.types.request_latency_metric¶
RequestLatencyMetric
¶
Bases: BaseMetric
Post-processor for calculating Request Latency metrics from records.
Source code in aiperf/services/records_manager/metrics/types/request_latency_metric.py
update_value(record=None, metrics=None)
¶
Adds a new record and calculates the Request Latency metric.
This method extracts the request and last response timestamps, calculates the differences in time, and appends the result to the metric list.
Source code in aiperf/services/records_manager/metrics/types/request_latency_metric.py
values()
¶
Returns the list of Request Latency metrics.
Source code in aiperf/services/records_manager/metrics/types/request_latency_metric.py
aiperf.services.records_manager.metrics.types.request_throughput_metric¶
RequestThroughputMetric
¶
Bases: BaseMetric
Post-processor for calculating Request Throughput metrics from records.
Source code in aiperf/services/records_manager/metrics/types/request_throughput_metric.py
values()
¶
Returns the Request Throughput metric.
Source code in aiperf/services/records_manager/metrics/types/request_throughput_metric.py
aiperf.services.records_manager.metrics.types.ttft_metric¶
TTFTMetric
¶
Bases: BaseMetric
Post-processor for calculating Time to First Token (TTFT) metrics from records.
Source code in aiperf/services/records_manager/metrics/types/ttft_metric.py
update_value(record=None, metrics=None)
¶
Adds a new record and calculates the Time To First Token (TTFT) metric.
This method extracts the timestamp from the request and the first response in the given RequestRecord object, computes the difference (TTFT), and appends the result to the metric list.
Source code in aiperf/services/records_manager/metrics/types/ttft_metric.py
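The computation described above (first-response timestamp minus request timestamp) can be sketched with hypothetical field names; `start_perf_ns`, `perf_ns`, and the class names below are illustrative, not the project's actual record schema:

```python
from dataclasses import dataclass, field


@dataclass
class ResponseSketch:
    perf_ns: int  # response timestamp in nanoseconds (illustrative field name)


@dataclass
class RecordSketch:
    start_perf_ns: int  # request send timestamp in nanoseconds
    responses: list[ResponseSketch] = field(default_factory=list)


class TTFTMetricSketch:
    """Collects time-to-first-token samples, one per record."""

    def __init__(self) -> None:
        self._ttfts: list[int] = []

    def update_value(self, record: RecordSketch) -> None:
        if not record.responses:
            return  # no response received; nothing to measure
        # TTFT = first response timestamp - request timestamp.
        self._ttfts.append(record.responses[0].perf_ns - record.start_perf_ns)

    def values(self) -> list[int]:
        return self._ttfts
```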
values()
¶
Returns the list of Time to First Token (TTFT) metrics.
Source code in aiperf/services/records_manager/metrics/types/ttft_metric.py
aiperf.services.records_manager.metrics.types.ttst_metric¶
TTSTMetric
¶
Bases: BaseMetric
Post-processor for calculating Time to Second Token (TTST) metrics from records.
Source code in aiperf/services/records_manager/metrics/types/ttst_metric.py
update_value(record=None, metrics=None)
¶
Adds a new record and calculates the Time To Second Token (TTST) metric.
This method extracts the timestamp from the first and second response in the given Record object, computes the difference (TTST), and appends the result to the metric list.
Source code in aiperf/services/records_manager/metrics/types/ttst_metric.py
values()
¶
Returns the list of Time to Second Token (TTST) metrics.
Source code in aiperf/services/records_manager/metrics/types/ttst_metric.py
aiperf.services.records_manager.post_processors.basic_metrics_streamer¶
BasicMetricsStreamer
¶
Bases: BaseStreamingPostProcessor
Streamer for basic metrics.
Source code in aiperf/services/records_manager/post_processors/basic_metrics_streamer.py
get_error_summary()
¶
Generate a summary of the error records.
Source code in aiperf/services/records_manager/post_processors/basic_metrics_streamer.py
process_records(cancelled)
async
¶
Process the records.
This method is called when the records manager receives a command to process the records.
Source code in aiperf/services/records_manager/post_processors/basic_metrics_streamer.py
stream_record(record)
async
¶
Stream a record.
Source code in aiperf/services/records_manager/post_processors/basic_metrics_streamer.py
aiperf.services.records_manager.post_processors.metric_summary¶
MetricSummary
¶
Bases: AIPerfLoggerMixin
MetricSummary is a post-processor that generates a summary of metrics from the records. It processes the records to extract relevant metrics and returns them in a structured format.
Source code in aiperf/services/records_manager/post_processors/metric_summary.py
post_process()
¶
Classifies and computes metrics in dependency order to ensure correctness. The metrics are categorized based on their dependency types:
- METRIC_OF_RECORDS: computed for each individual record (see :meth:`process_record`).
- METRIC_OF_BOTH: computed for each individual record (see :meth:`process_record`).
- METRIC_OF_METRICS: computed based only on other metrics (not records); may depend on any combination of METRIC_OF_RECORDS, METRIC_OF_BOTH, and other METRIC_OF_METRICS (if dependency order is respected). Computed using a dependency-resolution loop.

This process ensures:
- All metrics are computed exactly once, after dependencies are satisfied.
- Misconfigured or cyclic dependencies will raise an explicit runtime error.

Source code in aiperf/services/records_manager/post_processors/metric_summary.py
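The dependency-resolution loop for METRIC_OF_METRICS can be sketched generically; the function name and the metric tags in the test are illustrative:

```python
def resolve_metric_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Compute metrics whose dependencies are already satisfied, pass after
    pass, until none remain. If a pass makes no progress, the remaining
    dependencies are cyclic or misconfigured, and an explicit error is raised."""
    computed: list[str] = []
    pending = dict(dependencies)
    while pending:
        ready = [tag for tag, deps in pending.items() if deps <= set(computed)]
        if not ready:
            raise RuntimeError(f"Unresolvable metric dependencies: {sorted(pending)}")
        for tag in ready:
            computed.append(tag)  # in the real class, the metric is computed here
            del pending[tag]
    return computed
```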
process_record(record)
¶
Process a single record.
Classifies and computes metrics in dependency order to ensure correctness. The metrics are categorized based on their dependency types:
- METRIC_OF_RECORDS: depend solely on each individual record; computed first, as they have no dependencies.
- METRIC_OF_BOTH: depend on both the current record and previously computed metrics (specifically, METRIC_OF_RECORDS); computed after all METRIC_OF_RECORDS have been processed; must not depend on other METRIC_OF_BOTH or METRIC_OF_METRICS.
- METRIC_OF_METRICS: computed once after all records have been processed (see :meth:`post_process`).

Source code in aiperf/services/records_manager/post_processors/metric_summary.py
record_from_dataframe(df, metric)
¶
Create a Record from a DataFrame.
Source code in aiperf/services/records_manager/post_processors/metric_summary.py
aiperf.services.records_manager.post_processors.processing_stats_streamer¶
ProcessingStatsStreamer
¶
Bases: BaseStreamingPostProcessor
This streamer is used to track the number of records processed and the number of errors. It is also used to track the number of requests expected and the number of requests completed. It will send a notification message when all expected requests have been received.
Source code in aiperf/services/records_manager/post_processors/processing_stats_streamer.py
publish_processing_stats()
async
¶
Publish the profile processing stats.
Source code in aiperf/services/records_manager/post_processors/processing_stats_streamer.py
stream_record(record)
async
¶
Stream a record.
Source code in aiperf/services/records_manager/post_processors/processing_stats_streamer.py
aiperf.services.records_manager.post_processors.streaming_post_processor¶
BaseStreamingPostProcessor
¶
Bases: MessageBusClientMixin, ABC
BaseStreamingPostProcessor is a base class for all classes that wish to stream the incoming ParsedResponseRecords.
Source code in aiperf/services/records_manager/post_processors/streaming_post_processor.py
process_records(cancelled)
async
¶
Handle the process records command. This method is called when the records manager receives a command to process the records, and can be handled by the subclass. The results will be returned by the records manager to the caller.
Source code in aiperf/services/records_manager/post_processors/streaming_post_processor.py
stream_record(record)
abstractmethod
async
¶
Handle the incoming record. This method should be implemented by the subclass.
Source code in aiperf/services/records_manager/post_processors/streaming_post_processor.py
aiperf.services.records_manager.records_manager¶
RecordsManager
¶
Bases: PullClientMixin, BaseComponentService
The RecordsManager service is primarily responsible for holding the results returned from the workers.
Source code in aiperf/services/records_manager/records_manager.py
main()
¶
Main entry point for the records manager.
Source code in aiperf/services/records_manager/records_manager.py
aiperf.timing.concurrency_strategy¶
ConcurrencyStrategy
¶
Bases: CreditIssuingStrategy
Class for concurrency credit issuing strategy.
Source code in aiperf/timing/concurrency_strategy.py
aiperf.timing.config¶
TimingManagerConfig
¶
Bases: AIPerfBaseModel
Configuration for the timing manager.
Source code in aiperf/timing/config.py
from_user_config(user_config)
classmethod
¶
Create a TimingManagerConfig from a UserConfig.
Source code in aiperf/timing/config.py
aiperf.timing.credit_issuing_strategy¶
CreditIssuingStrategy
¶
Bases: TaskManagerMixin, ABC
Base class for credit issuing strategies.
Source code in aiperf/timing/credit_issuing_strategy.py
start()
async
¶
Start the credit issuing strategy. This will launch the progress reporting loop, the warmup phase (if applicable), and the profiling phase, all in the background.
Source code in aiperf/timing/credit_issuing_strategy.py
stop()
async
¶
Stop the credit issuing strategy.
Source code in aiperf/timing/credit_issuing_strategy.py
CreditIssuingStrategyFactory
¶
Bases: AIPerfFactory[TimingMode, CreditIssuingStrategy]
Factory for creating credit issuing strategies based on the timing mode.
Source code in aiperf/timing/credit_issuing_strategy.py
aiperf.timing.credit_manager¶
CreditManagerProtocol
¶
Bases: PubClientProtocol, Protocol
Defines the interface for a CreditManager.
This is used to allow the credit issuing strategy to interact with the TimingManager in a decoupled way.
Source code in aiperf/timing/credit_manager.py
CreditPhaseMessagesMixin
¶
Bases: MessageBusClientMixin, CreditPhaseMessagesRequirements
Mixin for services to implement the CreditManagerProtocol.
Requirements
This mixin must be used with a class that provides:

- pub_client: PubClientProtocol
- service_id: str
Source code in aiperf/timing/credit_manager.py, lines 65–151
publish_credits_complete()
async
¶
Publish the credits complete message.
Source code in aiperf/timing/credit_manager.py, lines 146–151
publish_phase_complete(phase, completed, end_ns)
async
¶
Publish the phase complete message.
Source code in aiperf/timing/credit_manager.py, lines 116–129
publish_phase_sending_complete(phase, sent_end_ns)
async
¶
Publish the phase sending complete message.
Source code in aiperf/timing/credit_manager.py, lines 102–114
publish_phase_start(phase, start_ns, total_expected_requests, expected_duration_sec)
async
¶
Publish the phase start message.
Source code in aiperf/timing/credit_manager.py, lines 81–100
publish_progress(phase, sent, completed)
async
¶
Publish the progress message.
Source code in aiperf/timing/credit_manager.py, lines 131–144
CreditPhaseMessagesRequirements
¶
Bases: AIPerfLoggerProtocol, Protocol
Requirements for the CreditPhaseMessagesMixin. This is the list of attributes that must be provided by the class that uses this mixin.
Source code in aiperf/timing/credit_manager.py, lines 57–62
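A requirements protocol of this shape can be expressed with typing.Protocol. The attribute names below come from the mixin's documented requirements; PubClientProtocol's method signature is a hypothetical stand-in, not the real interface.

```python
from typing import Protocol, runtime_checkable


class PubClientProtocol(Protocol):
    # Hypothetical method; the real protocol's surface may differ.
    async def publish(self, topic: str, message: object) -> None: ...


@runtime_checkable
class PhaseMessagesRequirements(Protocol):
    """Attributes a host class must provide to use the messages mixin."""
    pub_client: PubClientProtocol
    service_id: str


class FakeService:
    """A class satisfying the protocol structurally (no inheritance needed)."""
    def __init__(self):
        self.pub_client = object()
        self.service_id = "svc-1"
```

Because protocols are checked structurally, any class exposing both attributes satisfies the requirements without inheriting from them.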
aiperf.timing.fixed_schedule_strategy¶
FixedScheduleStrategy
¶
Bases: CreditIssuingStrategy
Class for fixed schedule credit issuing strategy.
Source code in aiperf/timing/fixed_schedule_strategy.py, lines 19–68
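A fixed-schedule strategy essentially replays a list of timestamp offsets, sleeping until each one and then issuing a credit. A minimal sketch, with function and parameter names that are illustrative rather than the real AIPerf API:

```python
import asyncio
import time


async def replay_schedule(offsets_ms, drop_credit):
    """Sleep until each offset (relative to a shared start), then issue a credit."""
    start = time.perf_counter()
    for offset in sorted(offsets_ms):
        delay = offset / 1000.0 - (time.perf_counter() - start)
        if delay > 0:
            await asyncio.sleep(delay)
        await drop_credit(offset)


async def demo():
    issued = []

    async def drop_credit(offset):
        issued.append(offset)

    # Offsets may arrive unsorted; the replay orders them.
    await replay_schedule([0, 20, 10], drop_credit)
    return issued


print(asyncio.run(demo()))  # → [0, 10, 20]
```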
aiperf.timing.request_rate_strategy¶
RequestRateStrategy
¶
Bases: CreditIssuingStrategy
Strategy for issuing credits based on a specified request rate.
Supports two modes:

- CONSTANT: Issues credits at a constant rate with fixed intervals
- POISSON: Issues credits using a Poisson process with exponentially distributed intervals
Source code in aiperf/timing/request_rate_strategy.py, lines 19–103
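The two interval schemes can be sketched directly: constant mode spaces credits exactly 1/rate apart, while Poisson mode draws exponentially distributed gaps with mean 1/rate, which yields a Poisson arrival process. The function name below is illustrative, not the real API.

```python
import random


def next_interval(rate_per_sec: float, poisson: bool) -> float:
    """Seconds until the next credit drop."""
    if poisson:
        # Exponential inter-arrival times produce a Poisson arrival process.
        return random.expovariate(rate_per_sec)
    # Constant mode: fixed spacing of 1/rate.
    return 1.0 / rate_per_sec
```

Over many draws the Poisson gaps average to 1/rate, so both modes hit the same long-run request rate; only the burstiness differs.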
aiperf.timing.timing_manager¶
TimingManager
¶
Bases: PullClientMixin, BaseComponentService, CreditPhaseMessagesMixin
The TimingManager service is responsible for generating the schedule and issuing timing credits for requests.
Source code in aiperf/timing/timing_manager.py, lines 49–198
drop_credit(credit_phase, conversation_id=None, credit_drop_ns=None)
async
¶
Drop a credit.
Source code in aiperf/timing/timing_manager.py, lines 182–198
main()
¶
Main entry point for the timing manager.
Source code in aiperf/timing/timing_manager.py, lines 201–205
aiperf.workers.credit_processor_mixin¶
CreditProcessorMixin
¶
Bases: CreditProcessorMixinRequirements
CreditProcessorMixin provides a method for processing credit drops.
Source code in aiperf/workers/credit_processor_mixin.py, lines 74–244
CreditProcessorMixinRequirements
¶
Bases: AIPerfLoggerProtocol, Protocol
CreditProcessorMixinRequirements is a protocol defining the attributes required by the CreditProcessorMixin.
Source code in aiperf/workers/credit_processor_mixin.py, lines 41–71
CreditProcessorProtocol
¶
Bases: Protocol
CreditProcessorProtocol is a protocol defining the method for processing credit drops.
Source code in aiperf/workers/credit_processor_mixin.py, lines 30–38
aiperf.workers.worker¶
Worker
¶
Bases: PullClientMixin, BaseComponentService, ProcessHealthMixin, CreditProcessorMixin
Worker is primarily responsible for making API calls to the inference server. It also manages the conversation between turns and returns the results to the Inference Results Parsers.
Source code in aiperf/workers/worker.py, lines 39–150
aiperf.workers.worker_manager¶
WorkerManager
¶
Bases: BaseComponentService
The WorkerManager service's primary responsibility is to manage the worker processes. It spawns the workers, monitors their health, and stops them when the service is stopped. In the future it will also be responsible for auto-scaling the workers.
Source code in aiperf/workers/worker_manager.py, lines 37–112
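Spawning and reaping worker processes can be sketched with the standard library's multiprocessing module. The worker entry point and naming scheme below are assumptions for illustration, not AIPerf's actual implementation:

```python
import multiprocessing as mp


def worker_main(worker_id: str) -> None:
    # A real worker would run its service loop here.
    pass


def spawn_workers(count: int) -> list[mp.Process]:
    """Start `count` worker processes and return their handles."""
    procs = []
    for i in range(count):
        p = mp.Process(target=worker_main, args=(f"worker_{i}",))
        p.start()
        procs.append(p)
    return procs


def stop_workers(procs: list[mp.Process], timeout: float = 5.0) -> None:
    """Wait for each worker to exit, escalating to terminate() on timeout."""
    for p in procs:
        p.join(timeout)
        if p.is_alive():
            p.terminate()
            p.join()
```

Health monitoring would layer on top of these handles, e.g. by periodically checking `Process.is_alive()` and respawning any dead worker.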
WorkerProcessInfo
¶
Bases: AIPerfBaseModel
Information about a worker process.
Source code in aiperf/workers/worker_manager.py, lines 28–34